Abstract


This  report  deals  with  the  problem  of  making  a  computer  reason  about  the  interactions 
between  knowledge  and  action.  In  particular,  we  want  to  be  able  to  reason  about  what 
knowledge  a  person  must  have  in  order  to  perform  an  action,  and  what  knowledge  a  person 
may gain by performing an action. The first problem we face in achieving this goal is that
the basic facts about knowledge which we need to use are most naturally expressed as a
modal logic. There are, however, no known techniques for efficiently doing automatic
deduction directly in modal logics. We solve this problem by taking the possible-world
semantics for a modal logic of knowledge and axiomatizing it directly in first-order logic.
This  means  that  we  reason  not  about  what  facts  someone  knows,  but  rather  what  possible 
worlds  are  compatible  with  what  he  knows.  We  integrate  this  theory  with  a  logic  of  actions 
by  identifying  possible  worlds  with  the  situations  before  and  after  an  action  is  performed. 
We  use  these  notions  to  express  what  knowledge  a  person  must  have  in  order  to  perform  a 
given  action  and  what  knowledge  a  person  acquires  by  carrying  out  a  given  action.  Finally, 
we  consider  some  domain-specific  control  heuristics  that  are  useful  for  doing  deductions  in 
this  formalism,  and  we  present  several  examples  of  deductions  produced  by  applying  these 
heuristics. 

This  report  is  a  slightly  revised  version  of  a  thesis  submitted  to  the  Department  of 
Electrical  Engineering  and  Computer  Science  of  the  Massachusetts  Institute  of  Technology 
on  February  9,  1979,  in  partial  fulfillment  of  the  requirements  for  the  degree  of  Doctor  of 
Philosophy. 
1. Introduction

1.1  The  Importance  of  Knowledge  in  Reasoning  about  Action 

Planning  sequences  of  actions  and  reasoning  about  the  effects  of  actions  is  one  of  the 
areas which has received the most attention from researchers in artificial intelligence (AI).
Systems such as SHRDLU (Winograd, 1971), STRIPS (Fikes and Nilsson, 1971), BUILD
(Fahlman, 1973), HACKER (Sussman, 1973), and NOAH (Sacerdoti, 1977) have explored
issues including plan generation and debugging, representing changes to a world, skill
acquisition, resolving conflicting goals, and hierarchical plan refinement. To date, however,
little  attention  has  been  paid  to  the  important  role  that  the  agent’s  knowledge  plays  in 
planning  and  acting  to  achieve  a  goal. 

Almost all AI planning systems assume that they have complete knowledge of all
relevant  aspects  of  the  problem  domain  and  problem  situation  in  which  they  operate. 
Often,  any  statement  which  cannot  be  inferred  to  be  true  is  assumed  to  be  false.  In  the  real 
world,  however,  planning  and  acting  must  frequently  be  performed  without  complete 
knowledge  of  the  situation.  This  imposes  two  additional  burdens  on  an  intelligent  agent 
trying  to  act  effectively.  First,  when  the  agent  entertains  a  plan  for  achieving  some  goal,  he 
must  consider  not  only  whether  the  physical  prerequisites  of  the  plan  are  satisfied,  but  also 
whether  he  has  all  the  information  necessary  to  carry  out  the  plan.  Second,  he  must  be  able 
to  reason  about  what  he  can  do  to  obtain  the  necessary  information  that  he  currently  lacks. 

Consider the problem of trying to open a safe. Typically, AI systems assume that if
there is an action that an agent is physically able to perform, and that action results in some
proposition P being true, then the agent can achieve P. In the case of opening a safe, there
is  certainly  some  action  that  any  human  agent  of  normal  abilities  is  physically  able  to 
perform  and  that  will  result  in  the  safe  being  open,  namely,  dialing  the  combination  of  the 
safe. It would be highly misleading, however, to claim that an agent could open the safe
simply  by  dialing  the  combination  unless  he  knew  what  the  combination  of  the  safe  was.  If, 
on  the  other  hand,  he  had  a  piece  of  paper  that  had  the  combination  of  the  safe  written  on 
it,  he  could  open  the  safe  by  reading  what  was  on  the  piece  of  paper  and  then  dialing  the 
combination of the safe, even if he did not know the combination initially.

What  we  seek  are  techniques  for  creating  computer  systems  capable  of  drawing 
conclusions  such  as  this  based  on  a  general  understanding  of  the  relationship  between 
knowledge  and  action.  The  question  of  generality  is  somewhat  problematical,  since  different 
actions  obviously  have  different  prerequisites  and  results  that  involve  knowledge.  To  make 
this  issue  concrete,  consider  trying  to  take  knowledge  into  account  in  the  STRIPS  approach 
to  the  representation  of  actions.  In  this  approach,  knowledge  about  an  action  is  represented 
by  three  lists:  a  list  of  preconditions  that  must  be  satisfied  for  the  action  to  be  applicable,  a 
list  of  deletions  which  might  not  be  true  any  longer  after  the  action  has  been  performed, 
and a list of additions which become true as a result of the action being performed. For
instance,  the  action  of  pushing  an  object  from  one  location  to  another  is  represented  by  the 
following  schema  (Fikes  and  Nilsson,  1971,  p.  201): 

Push(k,m,n):  Robot  pushes  object  k  from  place  m  to  place  n. 

Preconditions: At(k,m), Atr(m)

Deletions: Atr(m), At(k,m)

Additions: Atr(n), At(k,n)

The  interpretation  of  At(k,m)  is  that  object  k  is  at  place  m,  and  the  interpretation  of 
Atr(m)  is  that  the  robot  is  at  place  m.  Thus,  the  interpretation  of  the  entire  schema  is  that 
for  the  robot  to  push  an  object  from  one  place  to  another,  the  robot  and  the  object  must 
both  be  in  the  first  place,  and  after  the  robot  pushes  the  object,  the  robot  and  the  object  are 
no  longer  in  the  first  place,  but  are  now  in  the  second  place. 
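
The add/delete reading of this schema can be sketched in a few lines of Python. This is only an illustration of the three-list representation, not the STRIPS implementation: the Operator class, the helper names applicable and apply, and the object and place names (box, room1, room2) are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operator:
    """A ground STRIPS-style operator: three sets of atomic facts."""
    name: str
    preconditions: frozenset
    deletions: frozenset
    additions: frozenset


def push(k, m, n):
    """Ground instance of the Push(k,m,n) schema from the text."""
    return Operator(
        name=f"Push({k},{m},{n})",
        preconditions=frozenset({("At", k, m), ("Atr", m)}),
        deletions=frozenset({("Atr", m), ("At", k, m)}),
        additions=frozenset({("Atr", n), ("At", k, n)}),
    )


def applicable(op, state):
    """An operator is applicable when every precondition holds in the state."""
    return op.preconditions <= state


def apply(op, state):
    """Remove the deletions, then add the additions."""
    if not applicable(op, state):
        raise ValueError(f"{op.name} is not applicable")
    return (state - op.deletions) | op.additions


# Hypothetical world: the robot and a box are both in room1.
initial = frozenset({("At", "box", "room1"), ("Atr", "room1")})
final = apply(push("box", "room1", "room2"), initial)
print(sorted(final))
```

Note that a state here is just a set of facts that hold at one time; nothing in the representation says what the agent knows, which is the limitation discussed in the rest of this section.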

The problems that we will point out in trying to represent facts about the interaction of
knowledge and action in a STRIPS-like formalism are not unique to that system. Similar
difficulties  would  arise  in  trying  to  extend  any  of  the  systems  mentioned  above  to  take 
knowledge  into  account.  While  the  more  recent  systems  use  more  sophisticated  planning 
techniques  than  STRIPS  does,  their  representations  of  the  effects  of  actions  are  roughly 
equivalent  to  that  used  in  STRIPS. 

If  we  want  to  represent  an  action  like  dialing  the  combination  of  a  safe,  the  obvious 
thing  to  do  (and  essentially  the  only  thing  that  can  be  done  within  the  STRIPS  approach) 
would  be  to  have  one  of  the  preconditions  be  that  the  agent  knows  the  combination  of  the 
safe.  This  is  much  more  specific  than  it  needs  to  be,  however.  Doing  things  this  way  fails 
to  suggest  any  connection  between  the  fact  that  dialing  the  combination  of  the  safe  requires 
knowing  the  combination  and  the  fact  that  calling  someone  on  the  telephone  requires 
knowing  his  phone  number  or  the  fact  that  pushing  a  block  to  a  certain  location  requires 
knowing  where  that  location  is. 

What  all  these  examples  have  in  common  is  that  being  able  to  use  any  action  to  achieve 
a  goal  requires  knowing  what  action  to  take.  From  this  point  of  view,  knowing  the 
combination  of  a  safe  is  not  really  a  precondition  for  dialing  the  combination  of  the  safe; 
rather,  it  is  required  for  knowing  what  action  dialing  the  combination  of  the  safe  is. 
Similarly,  knowing  what  action  constitutes  calling  someone  on  the  telephone  normally 
requires  knowing  his  phone  number,  and  knowing  what  action  constitutes  pushing 
something  to  a  certain  location  requires  knowing  where  that  location  is. 

Now, we propose that for a general action like dialing combinations of safes, if an agent
knows what the action is (i.e., he knows how to dial combinations of safes "in general"), then
he knows what some specific instance of the action is (e.g., dialing combination C1 on safe
Sf1) if he knows what objects the action is being applied to in that instance. So knowing
what combination C1 is and knowing what safe Sf1 is would be sufficient for knowing what
action dialing C1 on Sf1 is. On the other hand, an agent could know that dialing the
combination of Sf1 on Sf1 will result in Sf1 being open, but not be able to open Sf1 because
he doesn’t know what combination the description “combination of Sf1” refers to, and
therefore doesn’t know what action constitutes dialing the combination of Sf1. A similar
analysis  applies  to  the  examples  of  calling  someone  on  the  telephone  or  pushing  a  block  to  a 
certain  location,  so  we  have  one  general  principle  that  covers  all  the  examples,  rather  than 
different  knowledge  preconditions  for  each  case. 

Adequately  representing  the  effects  of  actions  on  knowledge  also  goes  beyond  what  can 
easily be represented using the STRIPS approach. This might seem to be rather
straightforward. If we have an information gathering operation, like looking into a box, we could
simply  put  on  the  list  of  additions  for  the  action  that  the  agent  knows  what  is  in  the  box 
and  put  on  the  list  of  deletions  that  he  does  not  know  what  is  in  the  box.  This  might  be  all 
right  for  gains  in  knowledge  by  direct  observation,  but  there  are  more  subtle  problems  that 
it  overlooks. 

Consider  the  notion  of  a  test.  The  essence  of  a  test  is  that  it  is  an  action  that  has  a 
directly  observable  result  that  depends  conditionally  on  an  unobservable  precondition.  In 
the  use  of  litmus  paper  to  test  the  pH  of  a  solution,  the  observable  result  is  whether  the 
paper  is  red  or  blue,  and  the  unobservable  precondition  is  whether  the  solution  is  acid  or 
alkaline.  What  makes  such  a  test  useful  for  acquiring  knowledge  is  that  the  agent  can  infer 
from  his  knowledge  of  the  behavior  of  litmus  paper  and  the  observed  color  of  the  paper 
whether  the  solution  is  acid  or  alkaline.  In  a  test  it  is  usually  this  inferred  knowledge, 
rather than what is directly observed, that is important. After all, the color of a piece of
litmus  paper  is  seldom  an  intrinsically  interesting  piece  of  information. 

If  we  follow  the  previous  suggestion  in  trying  to  formulate  a  STRIPS  operator  for  using 
litmus  paper,  we  will  have  to  include  the  result  that  the  agent  knows  whether  the  solution  is 
acid or alkaline as a separate fact from the result that he knows the color of the paper. If
we  do  this,  however,  we  completely  miss  the  point  that  the  knowledge  of  the  pH  of  the 
solution  is  inferred  from  other  knowledge,  rather  than  being  a  direct  observation. 
Moreover,  we  are  in  effect  specifying  what  actions  constitute  possible  tests,  rather  than 
creating  a  system  that  is  able  to  infer  what  actions  are  possible  tests. 

If  we  want  to  capture  the  inference  that  the  agent  must  make  to  use  a  test,  we  have  to 
represent  several  independent  pieces  of  knowledge  that  the  agent  must  have.  Obviously,  we 
have to represent that after the test is performed, the agent knows the observable result.
This  much  is  handled  by  the  STRIPS  approach.  Furthermore,  we  have  to  represent  the 
fact  that  he  knows  that  the  test  has  been  performed.  If  he  just  walks  into  the  room  and  sees 
the  litmus  paper  on  the  table,  he  will  know  what  color  it  is,  but  unless  he  knows  its  recent 
history,  he  won't  have  gained  any  knowledge  about  the  acidity  of  the  solution. 
Representing  this  knowledge  in  a  principled  way  is  a  problem  for  the  STRIPS  approach. 
The  formulas  on  the  lists  of  additions  and  deletions  are  taken  to  be  true  or  false  at  a 
particular  time,  without  reference  to  other  times.  We  could  introduce  an  ad  hoc  predicate 
on  actions,  Has-Just-Occurred(x),  but  there  is  no  way  to  relate  this  predicate  to  the  notion  of 
time  implicit  in  the  distinction  between  preconditions  and  postconditions  (additions  and 
deletions)  in  the  descriptions  of  actions. 

We  also  need  to  represent  the  fact  that  the  agent  understands  how  the  test  works;  that  is, 
he  knows  how  the  observable  result  of  the  action  depends  on  the  unobservable 
precondition.  Even  if  he  sees  the  litmus  paper  put  into  the  solution,  and  he  sees  the  paper 
change  color,  he  still  won't  know  whether  the  solution  is  acid  or  alkaline,  unless  he  knows 
how  the  color  of  the  paper  is  related  to  the  acidity  of  the  solution.  This  knowledge  is,  in 
fact,  just  what  would  be  expressed  by  the  STRIPS  description  of  the  action.  This  creates 
another  problem  for  the  STRIPS  approach,  since  descriptions  of  actions  are  not  part  of  the 
language that preconditions and postconditions are written in. If knowing how the physical
preconditions  of  an  action  affect  the  physical  results  of  an  action  is  a  precondition  to  using 
the  action  as  a  test,  then  the  language  in  which  preconditions  are  written  must  be  able  to 
describe  at  least  the  physical  effects  of  the  action. 

Finally,  the  system  must  be  able  to  reason  that  if  the  agent  knows  (i)  that  the  test  took 
place,  (ii)  the  observable  result  of  the  test,  and  (iii)  how  the  observable  result  depends  on 
the  unobservable  precondition,  then  he  knows  the  unobservable  precondition.  Thus  the 
system  must  incorporate  a  logic  of  knowledge  to  tell  it  when  someone's  knowing  a  certain 
collection  of  facts  implies  that  he  knows  other  facts. 
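
The inference from (i), (ii), and (iii) can be sketched for the litmus example by representing what the agent knows as the set of alternatives he cannot rule out, in the spirit of the possible-world treatment summarized in the abstract. This is a minimal sketch under stated assumptions, not the formalism developed later: the World record, the function names dip, knows, knows_whether, and after_test, and the two-valued acid/alkaline model are all introduced here for illustration.

```python
from typing import NamedTuple, Optional


class World(NamedTuple):
    acid: bool              # unobservable precondition
    color: Optional[str]    # observable litmus color; None if the paper is unused


def dip(w):
    """How the test works: the paper turns red in acid, blue otherwise."""
    return World(w.acid, "red" if w.acid else "blue")


def knows(compatible, prop):
    """The agent knows prop if it holds in every world compatible with his knowledge."""
    return all(prop(w) for w in compatible)


def knows_whether(compatible, prop):
    return knows(compatible, prop) or knows(compatible, lambda w: not prop(w))


def after_test(actual, compatible):
    """Update the actual world and the agent's alternatives after dipping the paper."""
    new_actual = dip(actual)
    # (i) and (iii): he knows the test took place and how it works, so every
    # alternative he considers is the result of dipping in some prior alternative.
    candidates = {dip(w) for w in compatible}
    # (ii): he observes the color, which rules out alternatives that disagree.
    new_compatible = {w for w in candidates if w.color == new_actual.color}
    return new_actual, new_compatible


def is_acid(w):
    return w.acid


actual = World(acid=True, color=None)
before = {World(True, None), World(False, None)}   # he cannot tell acid from alkaline
new_actual, after = after_test(actual, before)

# An agent who merely sees red paper, without knowing its history, cannot
# rule out that the solution is alkaline.
walked_in = {World(True, "red"), World(False, "red")}

print(knows_whether(before, is_acid))     # False
print(knows_whether(after, is_acid))      # True
print(knows_whether(walked_in, is_acid))  # False
```

The knowledge of acidity is never asserted as a separate result of the action; it falls out of filtering the alternatives, which is the behavior the add/delete lists cannot express.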

From  the  preceding  discussion,  we  can  conclude  that  any  system  that  is  capable  of 
reasoning  about  tests  at  this  level  of  detail  must  be  able  to  explicitly  represent  facts  of  the 
following  types: 

(1) A knows that Q will be true after he performs Act just in case P is true now.

(2) After A performs Act, he knows that he has just performed Act.

(3) After A performs Act, he knows whether Q is true.

In order to reason that an agent can use a certain test to find out a piece of information, the
system must also embody or be able to represent general principles sufficient to conclude

(4) If (1), (2), and (3) are true, then after performing Act, A will know whether P was
true before the action was performed.

(5) If A knows that it is possible for him to achieve P by performing Act and he
knows what action Act is, then he can achieve P by performing Act.

(6) If Act is a specific instance of a class of actions that A knows how to perform in
general, then he knows what action Act is just in case he knows what the
arguments of Act refer to.

It is important to emphasize that for any work on these problems to be of real value it
must seek general principles. For instance, it would be possible to represent (1), (2), and (3)
in an arbitrary ad hoc way and add an axiom which explicitly states (4), thereby
"capturing" the notion of a test. Such an approach, however, would simply restate the
observations  that  we  have  made  in  this  section.  Our  goal  in  this  thesis  will  be  to  create  a 
system  in  which  specific  facts  like  (4)  follow  from  the  most  basic  principles  of  reasoning 
about  knowledge  and  action. 

There  has  been  little  previous  work  in  AI  on  these  problems.  McCarthy  and  Hayes 
(1969)  were  the  first  AI  workers  to  take  note  of  the  problem  of  actions  with  knowledge 
preconditions.  Their  proposed  solution  is  somewhat  sketchy,  and  it  seems  to  have  some 
problems.  They  first  present  a  set  of  axioms  expressed  in  the  situation  calculus  that  can  be 
used  to  deduce  that  dialing  the  combination  of  a  safe  will  result  in  the  safe  being  open. 
They  then  point  out  that  this  procedure  may  be  infeasible  for  an  agent,  because  he  may  not 
know the combination of the safe. Next they introduce the expression
idea-of-combination(p,sf,s) to mean person p's idea of the combination of the safe sf in situation s,
and suggest, but do not formalize, that it would be feasible for anyone to dial his idea of
the combination of the safe, since he presumably does know that. Given this, if it can be
shown  that  p's  idea  of  the  combination  of  the  safe  is,  in  fact,  the  combination  of  the  safe, 
then  dialing  the  combination  of  the  safe  is  both  feasible  for  p  and  effective  in  opening  the 
safe,  so  it  is  possible  for  p  to  open  the  safe. 

The  requirement  that  the  action  be  feasible  for  the  agent  seems  too  weak,  however. 
Suppose that 11L-22R-33L is the combination of the safe. If this is true, then dialing
11L-22R-33L will result in the safe being open. Furthermore, dialing 11L-22R-33L will be
feasible  for  anyone  who  understands  in  general  how  to  dial  combinations  of  safes.  But 
McCarthy  and  Hayes’s  argument  would  lead  us  to  infer  that  any  such  person  could  open 
the  safe,  whether  or  not  he  knew  the  combination.  The  central  role  of  knowing  the 
combination  has  been  missed. 
Another problem with McCarthy and Hayes's approach is the ad hoc character of the
idea-of-combination function. The problem is that the logic does not make any special
connection between idea-of-combination and combination, the function which picks out the
actual combination of a safe. Therefore, for each term in the language that we wanted to
talk about someone’s knowledge of, we would have to introduce a separate idea-of- function.
McCarthy and Hayes acknowledged the clumsiness of this approach, but saw no other way
of preserving the property of referential transparency, the ability to substitute equals for
equals. They would have preferred to use a general idea-of function, such that
idea-of(p,combination-of(sf),s) would refer to p's idea in s of the combination of sf. The trouble is
that if equals can be substituted for equals, and the combination of sf1 is the same as the
combination of sf2, then this would imply that p's idea of the combination of sf1 would
have to be the same as p’s idea of the combination of sf2, which is not necessarily the case.
We  will  present  a  much  more  elegant  solution  to  this  problem  in  section  2.5.  More  recently, 
McCarthy  (1979)  has  also  taken  a  different  approach  to  this  problem  which  we  will  discuss 
in  section  2.6. 
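
The substitution problem can be seen in miniature in any language where a function receives only the value of its argument. The sketch below is an illustration, not McCarthy and Hayes's formalism: the safes, the person p, the second combination 40R-5L-17R, and the lookup tables are hypothetical, and the situation argument is omitted for brevity.

```python
# Hypothetical facts: two different safes happen to share one combination.
combination_of = {"sf1": "11L-22R-33L", "sf2": "11L-22R-33L"}

# A safe-specific function in the style of idea-of-combination. It is keyed by
# the safe itself, so p can hold a correct idea about sf1 and a wrong one about sf2.
_ideas_by_safe = {("p", "sf1"): "11L-22R-33L", ("p", "sf2"): "40R-5L-17R"}


def idea_of_combination(person, safe):
    return _ideas_by_safe[(person, safe)]


# A general idea-of function applied to the value of the term combination-of(sf).
# It sees only the denotation, so it cannot tell which safe was being described.
_ideas_by_value = {("p", "11L-22R-33L"): "11L-22R-33L"}


def idea_of(person, value):
    return _ideas_by_value[(person, value)]


general_sf1 = idea_of("p", combination_of["sf1"])
general_sf2 = idea_of("p", combination_of["sf2"])

# Forced to agree, because the two arguments are the same value.
print(general_sf1 == general_sf2)
# Free to differ, at the cost of one such function per term.
print(idea_of_combination("p", "sf1") == idea_of_combination("p", "sf2"))
```

However the general table is filled in, the two general results must coincide, which is exactly the unwanted consequence of substituting equals for equals.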

The  only  other  work  in  AI  that  deals  explicitly  with  the  interaction  of  knowledge  and 
action  seems  to  be  that  of  Cohen  (1978).  Cohen's  formalism  is  a  straightforward  encoding 
of  the  STRIPS  approach  into  semantic  network  notation,  with  all  the  limitations  that  we 
have  pointed  out.  Cohen  never  faces  any  of  the  issues  we  have  raised,  because  he  does  not 
really  deal  with  the  problems  of  reasoning  about  knowledge.  His  system  generates  plans 
that  have  effects  on  what  people  know  and  that  require  knowledge  to  execute,  but  all 
statements about knowledge must be explicitly asserted; the system has no ability to infer
them.


1.2  Overview  of  the  Thesis 


This  thesis  attacks  the  problems  of  representing  the  kinds  of  facts  and  making  the  kinds 
of  inferences  described  in  the  previous  section.  First  we  will  discuss  the  representation 
problems. Then we will describe a formalism that captures the distinctions we need to make
and permits reasonably efficient automatic inferencing. Then we will outline a system (as
yet unimplemented) for automatically carrying out deductions in this formalism and
illustrate  its  operation  with  several  hand-simulated  examples.  We  will  organize  the 
presentation  around  a  set  of  examples  having  to  do  with  dialing  combinations  and  opening 
safes.  Starting  from  a  set  of  premises  that  respect  the  generalizations  we  have  discussed,  we 
will  show  how  to  automatically  deduce  that: 

(1) If John is at the same place as the safe Sf1, and he knows the combination of the
safe, he can open the safe by dialing the combination.

(2) If C1 is the combination of Sf1, and if John tries to open Sf1 by dialing C1, he
will then know that C1 is the combination of Sf1.

(3) If John is at the same place as the safe Sf1 and the piece of paper Ppr1, and he
knows that the combination of Sf1 is the only thing written on Ppr1, he can open
Sf1 by reading the piece of paper and dialing the combination.

The  first  of  these  examples  involves  understanding  what  knowledge  is  sufficient  for 
being  able  to  achieve  a  goal  by  performing  a  certain  action.  The  second  example  shows 
how knowledge can be acquired by using an action as a test. The third example involves
carrying  out  a  sequence  of  actions,  first  performing  one  action  to  obtain  some  knowledge, 
and  then  using  that  knowledge  to  perform  another  action  that  achieves  a  goal. 

It  should  be  emphasized  that  we  are  not  attacking  the  problem  of  automatically 
generating  plans  which  take  into  account  the  acquisition  and  use  of  knowledge.  Rather,  as 
the examples suggest, we are limiting our efforts to reasoning about how knowledge
interacts with a given action or sequence of actions. Although we would claim that this is a
prerequisite  to  solving  the  planning  problem,  it  is  certainly  not  sufficient  by  itself. 

Besides  wanting  to  extend  the  capabilities  of  AI  systems  in  reasoning  about  actions, 
there  are  more  general  reasons  for  undertaking  this  study.  First  of  all,  there  is  a  need  for 
AI  to  break  out  of  what  might  be  termed  "the  blocks-world  syndrome".  Most  of  the  work  in 
AI on common-sense reasoning and common-sense problem solving has dealt only with
discrete  physical  objects  and  physical  relations,  sometimes  operated  on  by  simple  sequences 
of  actions. 

This  leaves  a  multitude  of  representational  problems  untouched.  Some  of  these 
problems include modalities such as possibility ("It might be the case that..."), necessity ("It
must be the case that..."), ability ("John can do..."), permissibility ("John may do..."), and
obligation ("John should do..."). Other problems include counterfactual conditionals ("If I
had struck the match, it would be burning now."), action modifiers ("almost", "quickly",
"carefully"),  and  propositional  attitudes  ("wants",  "fears",  "believes",  "knows").  Most  AI 
systems  treat  time  as  a  sequence  of  discrete  states,  rather  than  as  a  continuum,  and  little 
work  has  been  done  on  reasoning  about  continuous  substances  such  as  water  and  air.  (This 
last  observation  is  due  to  Hayes  (1974).) 

There  is  a  large  literature  in  modern  philosophical  logic  in  many  of  these  areas  which 
seems  directly  applicable  to  AI  problems,  but  very  little  of  it  has  been  explored  by  the  AI 
community.  One  of  the  goals  of  this  thesis  is  to  take  the  ideas  of  philosophical  logicians  in 
one  particular  area,  in  this  case  the  logic  of  knowledge,  and  see  to  what  extent  they  can  be 
applied to AI problems.

Another  goal  of  the  thesis  is  to  provide  a  testbed  for  exploring  ideas  about  automatic 
deduction.  Most  work  on  automatic  deduction  has  been  done  in  the  area  of  mathematical 
theorem  proving,  with  what  is  generally  perceived  to  be  mixed  results.  Despite  years  of 
effort, no theorem proving program has ever proved a significant new result in
mathematics. The difficulty in evaluating this work is that the problems are so hard that it
is not clear what should count as success. After all, the number of people who have proved
interesting new results in mathematics is minuscule compared to the number who can solve
the  block  stacking  problems  that  have  been  studied  in  AI. 

The  example  problems  which  we  will  look  at  do  not  share  this  uncertainty.  They  are 
clearly solvable by anyone of normal intelligence, yet they will turn out to be a nontrivial
test  for  our  deductive  system.  So  this  domain  gives  us  problems  that  are  rich  enough  to  be 
challenging,  but  easy  enough  that  we  are  sure  we  should  be  able  to  solve  them. 

The  first  problem  we  will  face  in  carrying  out  this  project  is  that  the  basic  facts  about 
knowledge  that  we  need  to  use  are  most  naturally  expressed  as  a  modal  logic  (Hughes  and 
Cresswell,  1968).  So  far,  no  satisfactory  way  of  applying  automatic  deduction  techniques 
directly to modal logics has been developed. In chapter 2, we first discuss what properties
of  knowledge  we  need  to  formalize,  and  explain  the  computational  problems  created  by 
some  simple  approaches.  Fortunately,  we  can  get  around  these  problems  by  making  use  of 
the  possible-world  semantics  for  the  logic  of  knowledge  developed  by  Hintikka  (1962,  1969), 
based  on  the  possible-world  semantics  for  necessity  of  Kripke  (1963a,  1963b).  The 
approach which we will pursue is to axiomatize this model theory for the logic of
knowledge  directly  in  first-order  logic.  We  will  have  axioms  which  say  such  things  as  that 
A  knows  that  P  if  and  only  if  P  is  true  in  every  possible  world  which  is  compatible  with 
what  A  knows.  In  this  way,  we  can  reason  about  simple  relations  among  possible  worlds 
rather  than  troublesome  modal  operators  like  Know.  In  the  rest  of  chapter  2,  we  briefly 
discuss  applying  the  same  approach  to  reasoning  about  belief,  consider  in  detail  the  issues 
raised  by  the  introduction  of  quantifiers  and  equality  into  knowledge  contexts,  and  review 
some  alternative  approaches  that  have  been  proposed. 
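
The central axiom, that A knows that P if and only if P is true in every possible world compatible with what A knows, can be sketched as a check over an explicit compatibility relation. This is only a finite toy model under stated assumptions, not the first-order axiomatization of chapter 4: the world names w0 to w2, the agents, the fact names, and the helpers knows and knows_what are all introduced here for illustration.

```python
# Hypothetical worlds, each assigning values to the same two facts.
WORLDS = {
    "w0": {"combination": "11L-22R-33L", "safe_closed": True},
    "w1": {"combination": "11L-22R-33L", "safe_closed": True},
    "w2": {"combination": "40R-5L-17R", "safe_closed": True},
}

# COMPATIBLE[agent][w] is the set of worlds compatible with what the agent
# knows in world w. Only the entries for the actual world w0 are filled in.
COMPATIBLE = {
    "John": {"w0": {"w0", "w1"}},
    "Mary": {"w0": {"w0", "w2"}},
}


def knows(agent, world, prop):
    """Agent knows prop in world iff prop holds in every compatible world."""
    return all(prop(WORLDS[v]) for v in COMPATIBLE[agent][world])


def knows_what(agent, world, term):
    """Agent knows what term denotes iff it has one value across all compatible worlds."""
    return len({term(WORLDS[v]) for v in COMPATIBLE[agent][world]}) == 1


def closed(facts):
    return facts["safe_closed"]


def combination(facts):
    return facts["combination"]


print(knows("John", "w0", closed))           # True
print(knows("Mary", "w0", closed))           # True
print(knows_what("John", "w0", combination)) # True
print(knows_what("Mary", "w0", combination)) # False
```

Both agents know that the safe is closed, but only John knows what the combination is; for Mary the description "the combination" picks out different values in different compatible worlds. Reasoning of this kind involves only ordinary quantification over worlds, with no modal operator in sight.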
In chapter 3 we show how to extend this approach to reasoning about the interaction of
knowledge and action. The key ideas are (1) to recast McCarthy's situation calculus
(McCarthy,  1963),  (McCarthy  and  Hayes,  1969),  as  a  modal  logic  with  a  corresponding 
possible-world  semantics,  and  (2)  to  unify  this  formalism  with  the  one  for  knowledge  by 
identifying  possible  worlds  in  the  formalism  for  knowledge  with  situations  in  the  formalism 
for actions. We show how to describe both the dependence of action on knowledge in terms
of  knowing  precisely  what  action  to  perform,  and  how  to  describe  the  effects  of  action  on 
knowledge  in  terms  of  relations  between  possible  worlds. 

In  chapter  4,  we  present  the  details  of  a  first-order  axiomatization  of  the  possible-world 
semantics  for  knowledge,  and  illustrate  its  use  with  a  number  of  examples  of  formal 
deductions.  In  chapter  5  we  extend  the  formalism  to  handle  our  integrated  theory  of 
knowledge  and  action,  giving  more  examples. 

In  chapter  6,  we  turn  to  problems  of  automatically  generating  deductions  involving 
statements  about  knowledge.  We  first  present  an  outline  of  an  automatic  deduction  system 
in  which  certain  formulas  are  given  procedural  interpretations.  This  is  in  the  tradition  of 
PLANNER  (Hewitt,  1972)  and  related  formalisms,  and  was  studied  in  detail  by  Moore 
(1975).  We  then  discuss  the  appropriate  procedural  interpretations  for  facts  about 
knowledge,  and  show  how  to  use  these  interpretations  to  generate  some  simple  deductions. 

In  chapter  7  we  consider  automatic  deductions  involving  both  knowledge  and  action, 
and  discuss  in  detail  the  choice  of  procedural  interpretations  for  the  axioms  describing 
dialing  the  combination  of  a  safe  and  reading  a  piece  of  paper.  This  chapter  concludes 
with  detailed  explanations  of  automatically  generated  deductions  of  the  three  sample 
problems  given  in  this  section.  Finally,  chapter  8  summarizes  and  evaluates  our  results,  and 
suggests  possible  extensions. 

This thesis makes a number of substantial original contributions. One of these is pointing
out the efficiency advantages of using a first-order formalization of the possible-world
semantics  of  a  modal  logic  of  knowledge  and  action  over  some  of  the  more  obvious 
approaches  to  reasoning  directly  in  the  modal  logic  itself.  The  idea  of  doing  deductions  in 
modal  logics  indirectly  using  first-order  formalizations  of  their  semantics  has  been  suggested 
a  few  times  in  the  AI  literature  (McCarthy  and  Hayes,  1969),  (Morgan,  1976),  but  has  not 
been  extensively  pursued.  Our  point  is  that  the  possible-world  approach  is  not  merely  one 
of  several  possible  ways  of  reducing  a  modal  logic  to  an  ordinary  first-order  logic,  but  that 
it  has  important  properties  that  enable  standard  deduction  techniques  to  be  used  with 
reasonable  efficiency.  The  reasons  for  this  are  explained  in  section  2.3,  with  analysis  of  the 
shortcomings  of  alternative  approaches  being  given  in  sections  2.2  and  2.6. 

Also,  our  formalism  seems  to  be  the  first  using  this  approach  to  give  a  fully  adequate 
treatment  of  quantification  and  equality  for  the  logic  of  knowledge.  The  only  previous 
application  (of  which  I  am  aware)  of  a  first-order  formalization  of  the  possible-world 
semantics  for  modal  logics  to  reasoning  about  knowledge  is  in  some  unpublished  notes  by 
McCarthy  and  one  of  his  students  (McCarthy,  1975),  (Coad,  1976).  Their  work  considers 
only  a  propositional  form  of  the  logic  of  knowledge,  however.  Our  formalism,  on  the  other 
hand,  deals  with  a  full  quantified  logic  of  knowledge  with  equality,  which  is  essential  for 
carrying  out  the  inferences  about  knowledge  and  action  that  we  want  our  system  to  handle. 

The ideas on integrating reasoning about knowledge and action seem to be entirely
new.  The  main  contributions  here  are  the  idea  of  describing  the  effects  of  actions  in  terms 
of  a  modal  logic  parallel  to  the  modal  logic  for  knowledge,  unifying  the  two  logics  by 
identifying  the  situations  in  the  semantics  of  the  logic  of  actions  with  possible  worlds  in  the 
semantics  of  the  logic  of  knowledge,  analyzing  the  knowledge  preconditions  for  actions  in 
terms  of  knowing  what  action  to  perform,  and  describing  the  effects  of  actions  on 
knowledge in terms of relations between possible worlds. These ideas are the major
theoretical contribution of this thesis, and they make it possible to do reasoning about
knowledge  and  action  with  the  kind  of  generality  that  we  are  seeking.  For  instance,  they 
make  it  possible  to  derive  the  properties  of  tests  which  we  discussed  in  the  previous  section 
from  our  general  theory  of  knowledge  and  action.  The  possible-world  semantics  for  action 
also  provides  a  very  attractive  picture  of  the  relation  between  procedures  and  processes.  It 
falls  out  naturally  from  this  semantics  that  a  procedure  is  a  description  of  a  process. 
Technically,  the  denotation  of  a  certain  procedure  in  a  certain  environment  is  the  process 
which  results  from  executing  the  procedure  in  the  environment. 

Finally,  no  other  work  has  seriously  investigated  the  problems  of  doing  automatic 
deductions  in  this  domain.  Most  of  the  techniques  we  present  are  not  new  (although  many 
of  them  are  due  to  this  author  (Moore,  1975)),  but  applying  them  to  our  formalism  requires 
extensive  and  subtle  analysis.  In  fact,  it  is  probably  fair  to  say  that  this  is  the  most  complex 
formalization  of  a  common-sense  domain  to  which  these  sorts  of  techniques  have  been 
applied and represents their most severe test to date.

2. Reasoning about Knowledge

2.1  Formalising  Properties  of  Knowledge 

Since  techniques  for  reasoning  about  action  have  been  extensively  studied  in  AI,  while 
techniques  for  reasoning  about  knowledge  have  not,  we  will  attack  the  problems  of 
reasoning about knowledge first. In chapter 3 we will see that the formalism that we are led
to as a solution to these problems turns out to be well suited for an integrated system for
reasoning  about  both  knowledge  and  action. 

The  first  step  in  devising  a  formalism  for  reasoning  about  knowledge  is  to  decide  what 
general  properties  of  knowledge  we  want  that  formalism  to  capture.  It  should  be 
emphasised,  however,  that  we  are  not  going  to  attempt  to  define  what  knowledge  is.  That 
enterprise,  which  belongs  to  the  branch  of  philosophy  called  epistemology,  has  been  going 
on  for  several  thousand  years  without  reaching  a  consensus,  and  we  cannot  hope  to  solve 
the  problem  in  this  thesis.  More  importantly,  it  is  not  necessary  to  solve  that  problem  for 
our  purposes.  The  goal  of  epistemology  is  to  have  an  explanatory  theory  of  knowledge, 
whereas  all  we  need  is  a  descriptive  theory.  We  can  perfectly  well  do  common-sense 
reasoning about knowledge without having a theory of epistemology, just as we can do
common-sense reasoning about physical objects without having a theory of physics. What
we will need to do is to specify some of the basic properties of the common-sense notion of
knowledge, or more precisely, a common-sense notion of knowledge that is useful for
reasoning about planning and acting. Any philosophical theory of knowledge that explains
these  properties  would  be  acceptable  from  this  point  of  view,  but  it  is  not  necessary  for  us  to 
have  such  a  theory  to  achieve  our  goals. 

This  being  said,  the  properties  of  knowledge  that  we  will  be  most  interested  in 
formalizing are the ones that are relevant to planning and acting. One such property is
that anything that someone knows is true; it is impossible to have false knowledge. If P is
false,  we  would  not  want  to  say  that  John  knows  P.  We  might  say  that  John  believes  P  or 
that  John  believes  he  knows  P,  but  if  P  is  false,  then  it  simply  could  not  be  the  case  that 
John  knows  P. 

This  is,  of  course,  a  major  difference  between  knowledge  and  belief.  If  we  say  that 
John believes P, we are not committed to saying that P is either true or false, but if we say
that  John  knows  P,  we  are  committed  to  the  truth  of  P.  The  reason  that  this  distinction  is 
important  for  planning  and  acting  is  simply  that  for  an  agent  to  achieve  his  goals,  the 
beliefs  that  he  bases  his  actions  on  must  generally  be  true.  After  all,  merely  believing  that 
performing  a  certain  action  will  bring  about  a  desired  goal  is  not  sufficient  for  being  able  to 
achieve the goal; the action must actually have the intended effect.

Another fact that turns out to be important for planning is that if someone knows
something, he knows that he knows it. This principle is often required for reasoning about
plans consisting of several steps. Suppose an agent plans to use Act1 to achieve his goal,
but in order to perform Act1 he needs to know whether P is true and whether Q is true.
Suppose further that he already knows that P is true, and can find out whether Q is true by
performing Act2. The agent needs to be able to reason that after performing Act2 he will
know whether P is true and whether Q is true. We will be willing to assume that he knows
that he will know whether Q is true if he understands the effects of Act2, but how does he
know that he will know whether P is true? Presumably it works something like this: He
knows that P is true, so he knows that he knows that P is true, and he knows how Act2
affects P, so he knows that he will know whether P is true after he performs Act2. The key
step in this argument is an instance of the principle that if someone knows something, he
knows that he knows it.

It might seem that we would also want to have the principle that if someone doesn't
know something he knows that he doesn't know it, but this turns out to be false. Suppose
that  John  believes  that  P,  but  in  fact,  P  is  not  true.  Since  P  is  false,  John  certainly  doesn’t 
know  that  P,  but  it  is  highly  unlikely  that  he  knows  that  he  doesn't  know,  since  he  thinks 
that  P  is  true. 

Probably  the  most  important  fact  about  knowledge  that  we  will  want  to  capture  is  that 
people  can  reason  based  on  their  knowledge.  All  of  the  examples  we  have  given  depend  on 
the  assumption  that  if  an  agent  trying  to  solve  a  problem  has  all  the  relevant  information, 
he  will  apply  his  knowledge  to  get  a  solution.  This  presents  a  difficulty  for  us,  however, 
since  people  don’t,  in  fact,  know  all  the  logical  consequences  of  their  knowledge.  The 
trouble  is  that  we  never  can  be  sure  which  of  the  inferences  that  a  person  could  make,  he 
will  make.  I  believe  that  the  best  solution  is  to  adopt  the  principle  that  if  P  is  implied  by 
what someone knows, then he also knows P, but to treat it as a "plausible implication".

By  "plausible  implication",  we  will  mean  an  implication  schema  that  we  will  accept  in 
any particular case unless we have other information to the contrary. A plausible
implication, then, would behave just like an ordinary implication so long as nothing is
inferred  which  contradicts  a  previous  conclusion.  If  only  one  plausible  inference  is  made  in 
a  chain  of  reasoning  that  leads  to  a  contradiction,  then  that  inference  is  almost  certainly  the 
one  which  should  be  withdrawn.  If  more  than  one  plausible  inference  is  involved,  then  we 
get  into  the  complicated  problem  of  choosing  between  alternative  plausible  views  of  the 
world  (McDermott,  1974),  (Doyle,  1978).  This  is  a  very  general  problem  of  which  the  effects 
of  adopting  the  proposed  principle  are  only  one  example.  However,  since  our  examples 
involve such mundane bits of reasoning that it is extremely unlikely that any intelligent
agent  would  fail  to  make  them,  we  will  treat  the  principle  as  if  it  were  an  ordinary 
implication,  and  not  consider  the  problem  any  further. 

Finally, we will need to include the fact that these basic properties of knowledge are
themselves common knowledge. By this we mean that everyone knows them, and everyone
knows that everyone knows, and everyone knows that everyone knows that everyone knows,
etc. This type of principle is obviously needed when reasoning about what someone knows
about  what  someone  else  knows,  but  it  is  also  important  in  planning,  because  an  agent  must 
be  able  to  reason  about  what  he  will  know  at  various  times  in  the  future.  In  such  a  case, 
his  "future  self"  is  analogous  to  another  person. 

In  his  pioneering  work  on  the  logic  of  knowledge  and  belief,  Hintikka  (1962)  presents  a 
formalism  that  captures  all  these  properties.  We  will  define  a  formal  logic  based  on 
Hintikka's  ideas,  but  modified  somewhat  to  be  more  compatible  with  the  additional 
developments  in  this  thesis.  So,  what  follows  is  similar  to  the  system  developed  by  Hintikka 
in  spirit,  but  not  in  detail. 

The  language  of  this  system  is  the  language  of  propositional  logic  augmented  with  the 
operator  Know.  The  formula  Know(A,P)  is  interpreted  to  mean  that  the  person  denoted  by 
the  term  A  knows  the  proposition  corresponding  to  the  formula  P.  So  if  John  refers  to  John 
and Likes(Bill,Mary) means that Bill likes Mary, Know(John,Likes(Bill,Mary)) means that John
knows that Bill likes Mary. The closure of the following axiom schemata with respect to
the inference rule modus ponens (from (P ⊃ Q) and P, infer Q) defines the theorems of the
system: 

M1. Axioms of ordinary propositional logic (e.g. as in Rogers (1971))

M2. Know(A,P) ⊃ P

M3. Know(A,P) ⊃ Know(A,Know(A,P))

M4. Know(A,(P ⊃ Q)) ⊃ (Know(A,P) ⊃ Know(A,Q))

M5.  If  P  is  an  axiom,  then  Know(A,P)  is  an  axiom. 

This system is very similar to the systems studied in modal logic. In fact, if A is held
fixed, the resulting system is isomorphic to the modal logic S4 (Hughes and Cresswell, 1968).
We  will  refer  to  this  system  as  the  modal  logic  of  knowledge. 

These axioms formalize in a straightforward way the principles for reasoning about
knowledge which we discussed. M2 says that anything that is known is true. M3 says that
if someone knows something, he knows that he knows it. M4 says that if someone knows a
formula P and a formula of the form (P ⊃ Q), then he knows the corresponding formula Q.
That is, everyone can (and does) apply modus ponens. M5 is a recursive schema which tells
us that all the axioms are common knowledge. It first applies to M1 - M4, which says that
everyone knows the basic facts about knowledge, but it also applies to its own output, so we
get axioms that say that everyone knows that everyone knows, etc. Since M5 applies to the
axioms of propositional logic (M1), we can infer that everyone knows the facts they
represent. Furthermore, since modus ponens is the only inference rule needed in
propositional logic, the presence of M4 will enable us to infer that someone knows any
propositional consequence of his knowledge.
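
As an illustration only, the schemata can be written down as small constructors over formula trees. The tuple encoding, the function names, and the use of "->" for implication are assumptions of this sketch and are not part of the formalism itself.

```python
# Formulas are nested tuples; atoms and agent names are plain strings.
# This encoding is a hypothetical one chosen for the sketch.
def know(agent, p):
    return ("know", agent, p)

def implies(p, q):
    return ("implies", p, q)

def m2(a, p):
    """M2: anything that is known is true."""
    return implies(know(a, p), p)

def m3(a, p):
    """M3: if someone knows something, he knows that he knows it."""
    return implies(know(a, p), know(a, know(a, p)))

def m4(a, p, q):
    """M4: knowledge is closed under modus ponens."""
    return implies(know(a, implies(p, q)), implies(know(a, p), know(a, q)))

def m5(a, axiom):
    """M5: if `axiom` is an axiom, so is Know(a, axiom)."""
    return know(a, axiom)

def show(f):
    """Render a formula tree as text, writing -> for implication."""
    if isinstance(f, str):
        return f
    if f[0] == "know":
        return f"Know({f[1]},{show(f[2])})"
    return f"({show(f[1])} -> {show(f[2])})"

print(show(m5("Bill", m2("John", "P"))))
```

Because m5 accepts its own output, repeated application yields the "everyone knows that everyone knows" axioms to any depth.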

2.2  Computational  Problems  in  Reasoning  about  Knowledge 

The  modal  logic  of  knowledge  that  we  have  just  presented  is  an  elegant  and  concise 
formalization  of  the  properties  of  knowledge  that  we  want  to  capture,  but  it  has  one  major 
drawback as a basis for AI systems for reasoning about knowledge: so far, no satisfactory
ways have been devised of applying automatic deduction techniques directly to
systems of this type. The reason that the standard techniques cannot be applied directly is
that Know is an intensional rather than an extensional operator. Classical logics are
extensional because the truth value of a complex formula depends only on the extensions,
or  denotations,  of  its  subexpressions.  (The  denotation  of  a  term  is  the  individual  it  refers 
to,  the  denotation  of  a  predicate  symbol  is  the  set  of  individuals  that  satisfy  it,  and  the 
denotation  of  a  formula  is  its  truth  value.)  For  instance,  the  truth  of  (P  v  Q)  depends  only 
on  the  truth  of  P  and  the  truth  of  Q;  no  other  properties  of  P  and  Q  matter.  In  particular, 
the intensions, or meanings, of P and Q are irrelevant, except in so far as meaning
determines truth value. This is true of both first-order and higher-order classical logics.
This  restriction  was  recognized  by  the  founders  of  modern  logic  (Whitehead  and  Russell, 
1910),  and  it  is  one  of  their  great  triumphs  that  they  succeeded  in  formalizing  essentially  all 
of  mathematics  within  a  purely  extensional  framework. 

Knowing,  on  the  other  hand,  is  an  intensional  notion  because  the  truth  of  "A  knows  that 
P," depends generally on the meaning of P, rather than just its truth value. The truth value
of  P  is  clearly  important,  since  it  is  impossible  to  know  a  false  proposition,  but  it  is  not  the 
whole  story,  since  it  is  possible  to  know  some  true  propositions  and  not  know  other  true 
propositions.  The  standard  techniques  for  automatic  deduction  have  been  worked  out  only 
for  extensional  formalisms,  so  we  either  have  to  extend  the  known  techniques  or  find  a  way 
to  convert  the  modal  logic  of  knowledge  into  an  extensional  formalism. 

To  see  what  the  difficulties  really  are  we  will  examine  some  simple  approaches  and 
point  out  where  they  fail.  Suppose  that  we  have  a  system  for  doing  deductions  in 
propositional  logic  and  we  simply  add  formulas  of  the  form  Know(A,P)  to  the  data  base.  It 
is  easy  to  imagine  that  such  a  system  could  match  formulas  like  this,  even  though  they  are 
not  propositional  in  the  strictest  sense,  and  be  able  to  infer  things  like  Know(A,Q)  from 
Know(A,P) and (Know(A,P) ⊃ Know(A,Q)). The system would not, however, be able to do any
inferences that depend on the occurrence of logical operators inside of a Know operator.
The rules of propositional logic alone would not be sufficient to infer Know(A,Q) from
Know(A,P) and Know(A,(P ⊃ Q)).
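
The limitation can be seen in a minimal sketch of such a system. The tuple encoding and the function forward_chain are hypothetical. The prover closes a set of formulas under modus ponens and treats every Know formula as an opaque atom, so it never looks inside the operator.

```python
def know(agent, p):
    return ("know", agent, p)

def implies(p, q):
    return ("implies", p, q)

def forward_chain(facts):
    """Close a set of formulas under modus ponens only.

    Know(...) formulas are matched as whole units; nothing here
    inspects the formula nested inside a Know operator.
    """
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for f in list(known):
            is_rule = isinstance(f, tuple) and f[0] == "implies"
            if is_rule and f[1] in known and f[2] not in known:
                known.add(f[2])
                changed = True
    return known

goal = know("A", "Q")

# The implication sits outside Know: the goal is derived.
outside = forward_chain({know("A", "P"), implies(know("A", "P"), know("A", "Q"))})

# The implication sits inside Know: the goal is not derived.
inside = forward_chain({know("A", "P"), know("A", implies("P", "Q"))})

print(goal in outside, goal in inside)
```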

The obvious next step would be to add the axioms of the modal logic of knowledge to
the data base. This might present something of a problem, since they are schemata rather
than simple axioms (M5 would seem to be particularly troublesome), but we will let this
pass, since there are more severe difficulties to follow. This would produce a system capable
of doing all the deductions permitted by our logic of knowledge, but would be horrendously
inefficient if done in the obvious way. Basically, this move reduces the role of the
underlying deductive system to that of a simple interpreter, with the real control structure of
the deductive process being encoded in the axioms. Using a set of axioms to specify a
procedure is not necessarily inefficient. Indeed, recent work in "logic programming" (e.g.
Kowalski (1974)) is based on this very notion. Sussman and his colleagues (de Kleer, et al.,
1977)  also  use  axioms  to  specify  control  information,  but  in  a  quite  different  way.  The 
point  is  that  such  axiom  sets  must  be  carefully  designed  to  produce  reasonable  procedures 
when  interpreted.  The  modal  logic  of  knowledge  which  we  have  been  considering  was 
obviously  not  designed  for  that  purpose.  Consider  deductions  involving  the  axiom  schema 
M4: 


Know(A,(P ⊃ Q)) ⊃ (Know(A,P) ⊃ Know(A,Q)).

How should this schema be used? If we use it to add new facts to the data base, it will
match any assertion of the form Know(A,(P ⊃ Q)), producing a new assertion
of the form (Know(A,P) ⊃ Know(A,Q)). Used this way, however, M4 will interact with M5 to
add infinitely many new assertions to the data base. The result of applying M5 once to M4
could be written as:

Know(B,Know(A,(P ⊃ Q)) ⊃ (Know(A,P) ⊃ Know(A,Q))),

but M4 would apply to this in turn to produce:

Know(B,Know(A,(P ⊃ Q))) ⊃ Know(B,(Know(A,P) ⊃ Know(A,Q))).

This  would  apply  to  the  schema  that  results  from  applying  M5  twice  to  M4,  eventually 
producing  a  formula  that  would  apply  to  the  axioms  that  result  from  applying  M5  three 
times  to  M4,  etc.,  ultimately  producing  analogues  of  M4  for  all  depths  of  nesting  of  Know. 
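
The runaway behavior can be sketched as follows. This is a simplification under stated assumptions: the tuple encoding is hypothetical, and the helper distribute compresses "apply M5, then apply M4 to the result" into a single step instead of modeling the data base and its matcher.

```python
def know(agent, p):
    return ("know", agent, p)

def implies(p, q):
    return ("implies", p, q)

def m4(a, p, q):
    return implies(know(a, implies(p, q)), implies(know(a, p), know(a, q)))

def nesting_depth(f):
    """Deepest nesting of Know operators in a formula."""
    if isinstance(f, str):
        return 0
    if f[0] == "know":
        return 1 + nesting_depth(f[2])
    return max(nesting_depth(f[1]), nesting_depth(f[2]))

def distribute(agent, implication):
    """From (X -> Y), produce Know(agent,X) -> Know(agent,Y).

    Stands in for wrapping the implication with M5 and then
    distributing Know over it with M4.
    """
    _, x, y = implication
    return implies(know(agent, x), know(agent, y))

def m4_analogues(rounds):
    """Generate successively deeper analogues of M4."""
    current = m4("A", "P", "Q")
    produced = [current]
    for _ in range(rounds):
        current = distribute("B", current)
        produced.append(current)
    return produced

print([nesting_depth(f) for f in m4_analogues(3)])
```

Each round yields a formula one level deeper than the last, and nothing in the axioms themselves stops the process.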

So, to avoid generating infinitely many new assertions, we would have to use M4 as a
subgoal generator, if it is to be used at all. Used in this way, it would match any goal of the
form Know(A,Q), generating the conjunctive subgoals Know(A,(P ⊃ Q)) and Know(A,P).
Whichever  of  these  goals  is  attempted  first,  M4  applies  again,  producing  still  more 
complicated  goals.  It  is  possible  to  continue  in  this  way  to  an  arbitrary  depth  without  ever 
considering  any  substantive  facts  relevant  to  deducing  the  original  goal. 
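
A rough sketch of this regress, assuming a hypothetical tuple encoding and fresh variable names of the form ?P1, ?P2, and so on: each backward application of M4 to the first subgoal yields a larger goal, and no fact in the data base is ever consulted.

```python
import itertools

def know(agent, p):
    return ("know", agent, p)

def implies(p, q):
    return ("implies", p, q)

def m4_subgoals(goal, fresh):
    """Use M4 backwards on a goal Know(A,Q).

    Returns the pair of subgoals Know(A,(P -> Q)) and Know(A,P),
    where P is a fresh, still unbound variable.
    """
    _, agent, q = goal
    p = next(fresh)
    return know(agent, implies(p, q)), know(agent, p)

def expand_first(goal, steps):
    """Keep applying M4 to the first subgoal; return the goals visited."""
    fresh = (f"?P{i}" for i in itertools.count(1))
    trail = [goal]
    for _ in range(steps):
        goal, _second = m4_subgoals(goal, fresh)
        trail.append(goal)
    return trail

for g in expand_first(know("A", "Q"), 3):
    print(g)
```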

The  fundamental  problem  with  this  approach  is  that  no  matter  how  smart  the  basic 
deductive  system  is,  the  axioms  for  knowledge  seize  control  of  the  deductive  process.  These 
axioms were originally selected for their elegance and brevity, not for efficient generation of
proofs. What seems to be needed, then, is a way of running the basic deduction system
"inside" the operator Know.

One  way  of  producing  such  a  system  would  be  to  avoid  having  axioms  for  knowledge 
altogether  by  finding  a  computational  analogue  of  knowing,  some  computational  structure 
which  can  serve  as  a  direct  representation  of  someone  knowing  something.  There  is  an  idea 
along  these  lines  that  initially  seems  very  appealing.  Using  the  multiple  data-base 
capabilities  of  advanced  AI  languages,  we  could  set  up  a  separate  data  base  for  each  person 
whose  knowledge  we  have  some  information  about.  We  then  can  record  what  we  know 
about  his  knowledge  in  that  data  base,  and  simulate  his  reasoning  by  running  our  standard 
inference  routines  in  that  data  base.  This  would  allow  us  to  eliminate  any  explicit  reference 
to  knowing  individual  facts,  and  so  avoids  dealing  with  the  modal  operator  Know.  This 
idea seems to have wide currency in AI circles, and I advocated it myself in an earlier paper
(Moore, 1973).

This idea handles simple statements about knowledge quite well. Suppose that DBrw
contains what we believe to be true about the real world. If we want to assert that John
knows that P we would create a new data base, DBrw.john, assert P in this new data base,
and set a pointer in the old data base to indicate where we can find information about
John's knowledge. Furthermore, to assert that John knows that Bill knows that P, all we
have to do is iterate this process. We create a third data base, DBrw.john.bill, assert P in this
data base, and set a pointer to it in DBrw.john labeled "Bill's knowledge". Since this is
already in a data base which is restricted to John's knowledge, it would automatically be
interpreted as what John knows about what Bill knows.
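
The nesting scheme might be sketched as below. The class DataBase and the helper names are hypothetical and do not refer to any particular AI language. A chain of knowers such as ["John", "Bill"] selects the data base holding what John knows about what Bill knows.

```python
class DataBase:
    """A set of asserted facts plus pointers to per-person data bases."""

    def __init__(self):
        self.facts = set()
        self.knowledge = {}  # person name -> DataBase for that person's knowledge

def assert_knows(db, knowers, fact):
    """Record that knowers[0] knows that knowers[1] knows ... that fact."""
    for person in knowers:
        db = db.knowledge.setdefault(person, DataBase())
    db.facts.add(fact)

def recorded_as_known(db, knowers, fact):
    """True only if the fact was explicitly asserted along this chain."""
    for person in knowers:
        db = db.knowledge.get(person)
        if db is None:
            return False
    return fact in db.facts

real_world = DataBase()
assert_knows(real_world, ["John"], "P")
assert_knows(real_world, ["John", "Bill"], "P")
print(recorded_as_known(real_world, ["John", "Bill"], "P"))
```

Note that a False result from recorded_as_known only means that nothing was asserted; it does not represent the claim that the person lacks the knowledge.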

If  we  want  to  make  assertions  that  are  logically  more  complex,  however,  we  run  into 
trouble.  Consider  the  problem  of  representing  "John  knows  that  P  or  John  knows  that  Q." 
We can't represent this by simply adding (P v Q) to DBrw.john, because this would mean
"John knows that P or Q," - something quite different. We could set up two data bases,
DBrw.john1 and DBrw.john2, add P to one and Q to the other, and then assert in DBrw
"DBrw.john1 represents John's knowledge, or DBrw.john2 represents John's knowledge."
However, if we also wanted to assert "John knows that R, or John knows that S, or John
knows that T," we would need six data bases to represent all the possibilities for John's
knowledge - one for each of the combinations P and R, Q and R, P and S, etc. As we add
more  disjunctive  assertions,  we  get  a  combinatorial  explosion  in  the  number  of  data  bases. 
A  more  sophisticated  approach  might  retain  the  modal  operator  Know  for  the  basic 
representation  and  convert  to  the  data  base  representation  only  after  the  specific  facts 
relevant  to  the  problem  at  hand  have  been  identified,  thus  limiting  the  combinatorics. 
Stallman  and  Sussman  (1976)  and  Doyle  (1978)  have  worked  out  advanced  techniques  for 
handling  multiple  data  bases  that  might  be  useful  in  this  approach,  but  the  details  remain 
to  be  worked  out. 
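
The count of six can be checked with a short sketch. The function name candidate_databases is hypothetical. Each disjunctive assertion contributes one choice, and one candidate data base is needed per combination of choices.

```python
from itertools import product

def candidate_databases(disjunctions):
    """One candidate data base per way of picking a disjunct from each assertion."""
    return [frozenset(choice) for choice in product(*disjunctions)]

# "John knows P or John knows Q" and "John knows R, or S, or T".
two_assertions = candidate_databases([["P", "Q"], ["R", "S", "T"]])
print(len(two_assertions))

# A third, two-way disjunctive assertion doubles the number again.
three_assertions = candidate_databases([["P", "Q"], ["R", "S", "T"], ["U", "V"]])
print(len(three_assertions))
```

The number of candidates is the product of the sizes of the disjunctions, which is the combinatorial explosion described above.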

A  more  serious  problem  is  representing  what  someone  doesn't  know.  Suppose  we  want 
to represent "John doesn't know that P." We can't add ¬P to DBrw.john, because this would
be asserting "John knows that ¬P," and simply omitting P from DBrw.john means that we
don't know whether John knows that P.

I have heard two suggestions as to how this problem might be overcome. One
suggestion is that we change conventions so that omitting something from DBrw.john means
that  John  doesn't  know  it.  This  would,  however,  require  explicitly  representing  all  aspects 
of  John's  knowledge  of  which  we  are  ignorant,  a  prospect  which  seems  far  more 
troublesome  than  the  original  problem. 

The  other  suggestion  is  to  let  the  logic  used  in  the  data  bases  be  three-valued  -  true, 
false, and undefined. If P is marked as true in DBrw.john then John knows that P; if P is
marked as false, then John knows ¬P; if P is marked as undefined, then John doesn't know
one way or the other. This doesn't work for multiple embeddings of Know, however.
Representing "John knows that Bill doesn't know whether P," is no problem. We simply
mark P as undefined in DBrw.john.bill. But how can we represent "John doesn't know
whether Bill knows that P"? There is no assertion in DBrw.john to mark as undefined,
because "Bill knows that P," is represented implicitly by asserting P in DBrw.john.bill.

It  appears,  then,  that  what  John  doesn't  know  has  to  be  kept  separate  from  what  he 
does  know.  But  there  are  inferences  that  require  looking  at  both.  For  example,  if  we  have 
"John  doesn’t  know  that  P,"  and  "John  knows  that  Q  implies  P,"  we  might  want  to  conclude 
that  "John  doesn't  know  that  Q,"  is  probably  true. 

This is representative of a class of inferences that the data base approach doesn't
capture. There seems to be a fundamental problem in saying things about a person's
knowledge that go beyond simply enumerating what he knows. There may be ways of
getting around these difficulties, but it is clear that any adequate solution is going to be
much more complex than "just using data bases".

2.3 Possible-World Semantics for Knowledge

So  far,  all  of  the  proposals  that  we  have  seen  for  reasoning  directly  about  formulas  of 
the  form  Know(A,P)  have  led  to  problems.  There  may  well  be  solutions  to  these  problems, 
but it turns out that they can be circumvented entirely by changing the language we use to
describe what people know. Rather than talk about the individual statements that someone
knows we will talk instead about what states of affairs are compatible with what he knows.
In philosophy, these states of affairs are usually called "possible worlds", so we will adopt
that term as well.

This move to describing knowledge in terms of possible worlds is based on a rich and
elegant formal semantics for systems like our modal logic of knowledge that was developed
by  Hintikka  (1962,  1969)  in  his  work  on  knowledge  and  belief.  The  advantage  of  this 
approach  is  that  it  can  be  formalized  within  ordinary  first-order  classical  logic  in  a  way 
that  permits  the  use  of  standard  automatic  deduction  techniques  in  a  reasonably  efficient 
manner. 

Possible-world  semantics  was  first  developed  for  the  logic  of  necessity  and  possibility.  It 
is  a  very  old  idea  in  philosophy  (usually  attributed  to  Leibniz)  that  a  proposition  is 
necessarily  true  if  and  only  if  it  is  true  in  all  possible  worlds.  Conversely,  a  proposition  is 
possibly  true  if  and  only  if  there  is  some  possible  world  where  it  is  true.  Intuitively,  a 
possible  world  may  be  thought  of  as  a  set  of  circumstances  that  might  have  been  true  in  the 
actual  world.  This  informal  analysis  leaves  many  questions  unanswered,  however.  We 
have  said  that  a  necessary  truth  must  be  true  in  all  possible  worlds,  but  must  it  be 
necessarily  true  in  all  of  them?  Or,  are  possibly  true  propositions  necessarily  possible? 
Axiomatizations  of  modal  logics  have  proliferated  as  philosophers  have  argued  various 
sides  of  questions  such  as  these. 

In the early 1960's, the development of formal possible-world semantics provided a
unifying framework for viewing these various axiom systems. The key new idea was to
regard different worlds as being possible, not absolutely, but only relative to other worlds.
That is, the world W1 might be a possible alternative to W2, but not to W3. The structure
of which worlds are possible alternatives to which other worlds is said to define an
accessibility relation. The high point in the development of this theory came when Kripke
(1963a) proved that the differences among some of the most important proposed axiom
systems for modal logic corresponded exactly to certain restrictions on the accessibility
relation of the possible-world models of those systems. These results are reviewed in Kripke
(1963b). 

Concurrent  with  these  developments,  Hintikka  (1962)  published  the  first  of  his  work  on 
the  logic  of  knowledge  and  belief,  which  included  a  model  theory  that  was  much  like 
Kripke's  possible-world  semantics.  Hintikka's  original  semantics  was  done  in  terms  of  sets 
of sentences, which he called model sets, rather than possible worlds. Later (Hintikka, 1969),
however,  he  recast  his  semantics  into  Kripke’s  terms,  and  it  is  that  formulation  which  we 
will  use  here. 

Kripke's  semantics  for  necessity  and  possibility  can  be  converted  into  Hintikka’s 
semantics  for  knowledge  by  changing  the  interpretation  of  the  accessibility  relation.  In 
order  to  analyze  statements  of  the  form  Know(A,P),  we  will  introduce  a  relation  K,  such  that 
K(A,W1,W2) means that the possible world W2 is compatible or consistent with what A knows
in the possible world W1. In other words, for all that A knows in W1, he might just as well
be in W2. It is the set of worlds {w2 | K(A,W1,w2)} that we will use to characterize what A
knows in W1. We will discuss A's knowledge in W1 in terms of this set, the set of states of
affairs that are consistent with his knowledge in W1, rather than in terms of the set of
propositions that he knows. For the present we will assume that the first argument position
of K admits the same set of terms as the first argument position of Know. When we consider
quantifiers and equality in section 2.5, we will have to modify this assumption, but it will do
for now.

Introducing K is the key move in our analysis of statements about knowledge, so
understanding what K means is particularly important. To illustrate, suppose that in the
actual world (call it W0) John knows that P, but doesn't know whether Q. If W1 is a world
where P is false, then W1 is not compatible with what John knows in W0, so we would have
¬K(John,W0,W1). Suppose that W2 and W3 are compatible with everything John knows, but
Q is true in W2 and false in W3. Since John doesn't know whether Q is true, for all he
knows, he might be in either W2 or W3 instead of W0. Hence, we would have both
K(John,W0,W2) and K(John,W0,W3). This is depicted graphically in figure 2.1.
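
The John example can be sketched as a small executable model, with W0 standing for the actual world. This is only an illustration: the dictionary layout, the function names, the truth values not fixed by the text, and the choice to count W0 itself among the worlds compatible with what John knows are all assumptions made for the sketch.

```python
# Atomic propositions true in each world. The valuation of W0 (and the truth
# of Q in W1) is an assumption; the text fixes only the facts noted below.
facts = {
    "W0": {"P", "Q"},
    "W1": {"Q"},        # P is false in W1
    "W2": {"P", "Q"},   # Q is true in W2
    "W3": {"P"},        # Q is false in W3
}

# K[(agent, w)] = worlds compatible with what agent knows in w.
# Including W0 itself is an assumption of this sketch.
K = {("John", "W0"): {"W0", "W2", "W3"}}


def knows(agent, world, prop):
    """Agent knows prop in world iff prop holds in every compatible world."""
    return all(prop in facts[w] for w in K[(agent, world)])


def knows_whether(agent, world, prop):
    """Agent knows whether prop iff prop has a single truth value
    across all the compatible worlds."""
    values = {prop in facts[w] for w in K[(agent, world)]}
    return len(values) == 1


print(knows("John", "W0", "P"))          # True
print(knows_whether("John", "W0", "Q"))  # False
```

W1 is left out of John's compatible worlds because P is false there, which is what ¬K(John,W0,W1) expresses.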

Some  of  the  properties  of  knowledge  can  be  captured  by  putting  constraints  on  the 
accessibility relation K. For instance, requiring that the actual world W0 be compatible with
what each knower knows in W0, i.e., ∀a1(K(a1,W0,W0)), is equivalent to saying that anything
that  is  known  is  true.  That  is,  if  the  actual  world  is  compatible  with  what  everyone 
(actually)  knows,  then  no  one  has  any  false  knowledge.  This  corresponds  to  the  modal 
axiom  M2. 

The definition of K implies that if A knows that P in W0, then P must be true in every
world W1 such that K(A,W0,W1). To capture the fact that people can reason with their
knowledge, we will assume the converse is also true. That is, we assume that if P is true in
every world W1 such that K(A,W0,W1), then A knows that P in W0. (See figure 2.2.) This
principle is the model-theoretic analogue of axiom M4 in the modal logic of knowledge. To
see that this is so, suppose that A knows that P and that (P ⊃ Q). Therefore, P and (P ⊃ Q)
are both true in every world that is compatible with what A knows. If this is the case,
though, then Q must be true in every world that is compatible with what A knows. By our
assumption, then, we conclude that A knows that Q.


Figure 2.1 "John knows that P." "John doesn't know whether Q."

Figure 2.2 "A knows that P." "P is true in every world which is compatible with what A knows."

Since this assumption, like M4, is equivalent to saying that a person knows all the
logical  consequences  of  his  knowledge,  it  should  be  interpreted  only  as  a  plausible 
implication.  In  a  particular  instance,  the  fact  that  P  follows  from  A’s  knowledge  would  be  a 
justification  for  concluding  that  A  knows  P.  However,  we  should  be  prepared  to  retract  the 
conclusion  that  A  knows  P  in  the  face  of  stronger  evidence  to  the  contrary. 

With  this  assumption,  we  can  get  the  effect  of  M3,  the  axiom  that  if  someone  knows 
something, he knows that he knows it, by requiring that for any W1 and W2, if W1 is
compatible with what A knows in W0 and W2 is compatible with what A knows in W1, then
W2 is compatible with what A knows in W0. Formally this is:

∀a1,w1,w2(K(a1,W0,w1) ⊃ (K(a1,w1,w2) ⊃ K(a1,W0,w2)))

By  our  previous  assumption,  the  facts  that  A  knows  are  the  facts  that  are  true  in  every 
world  that  is  compatible  with  what  A  knows  in  the  actual  world.  Furthermore,  the  facts 
that  A  knows  that  he  knows  are  those  that  are  true  in  every  world  that  is  compatible  with 
what  he  knows  in  every  world  that  is  compatible  with  what  he  knows  in  the  actual  world. 
By the constraint we have just proposed, however, all these worlds must also be compatible
with  what  A  knows  in  the  actual  world  (see  figure  2.3),  so  if  A  knows  that  P  he  knows  that 
he  knows  that  P. 
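
The effect of this constraint can be checked on a toy model. The sketch below is only illustrative: the two relations K_bad and K_good, the valuation, and all function names are assumptions chosen to show one model that violates the constraint and one that satisfies it.

```python
def knows(K, facts, agent, world, prop):
    """prop is true in every world compatible with what agent knows in world."""
    return all(prop in facts[w] for w in K.get((agent, world), set()))


def knows_that_knows(K, facts, agent, world, prop):
    """agent knows prop in every world compatible with what he knows in world."""
    return all(knows(K, facts, agent, w1, prop)
               for w1 in K.get((agent, world), set()))


def satisfies_constraint(K, agent, w0):
    """If K(agent,w0,w1) and K(agent,w1,w2), then K(agent,w0,w2)."""
    return all(w2 in K[(agent, w0)]
               for w1 in K[(agent, w0)]
               for w2 in K.get((agent, w1), set()))


facts = {"W0": {"P"}, "W1": {"P"}, "W2": set()}   # P is false in W2

# W2 is reachable in two steps from W0 but not in one: the constraint fails.
K_bad = {("A", "W0"): {"W0", "W1"}, ("A", "W1"): {"W1", "W2"}}
# Every two-step world is also a one-step world: the constraint holds.
K_good = {("A", "W0"): {"W0", "W1"}, ("A", "W1"): {"W0", "W1"}}

for K in (K_bad, K_good):
    print(satisfies_constraint(K, "A", "W0"),
          knows(K, facts, "A", "W0", "P"),
          knows_that_knows(K, facts, "A", "W0", "P"))
# False True False
# True True True
```

In the first model A knows that P without knowing that he knows it; once the constraint holds, the second kind of knowledge follows from the first.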

Finally, we can get the effect of M5, the assertion that the basic facts about knowledge
are themselves common knowledge, by generalizing these constraints so that they hold not
only for the actual world but for all possible worlds. This follows from the fact that if these
constraints hold for all worlds, they hold for all worlds that are compatible with what anyone
knows in the actual world, and they hold for all worlds that are compatible with what
anyone knows in all worlds that are compatible with what anyone knows in the actual
world, etc. Therefore, everyone knows the facts about knowledge that the constraints
represent, and everyone knows that everyone knows, etc. Notice that this generalization has
the interesting effect that the constraint that corresponds to M2 becomes the requirement
that for a given knower, K is reflexive, and the constraint corresponding to M3 becomes the
requirement that for a given knower, K is transitive. In other words, for each knower, K
specifies a preorder (a reflexive and transitive relation) on the set of possible worlds.

Figure 2.3 "If A knows that P, then he knows that he knows that P."

Analyzing knowledge in terms of possible worlds gives us a very nice treatment of
knowledge about knowledge. Suppose John knows that Bill knows that P. Then if the
actual world is W0, in any world W1 such that K(John,W0,W1), Bill knows that P. We now
continue the analysis relative to W1, giving us that in any world W2 such that K(Bill,W1,W2),
P is true. Putting both stages together, we get that for any worlds W1 and W2, if
K(John,W0,W1) and K(Bill,W1,W2), then P is true in W2. (See figure 2.4.) This is somewhat
similar to the treatment of knowledge about knowledge in the data base approach. There
we used chains of pointers between data bases to represent what one person knows about
what another person knows. Here we are using chains of accessibility relationships between
possible worlds for the same purpose.

Figure 2.4 "John knows that Bill knows that P."
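
The chaining of accessibility relations can be sketched directly. In the following illustration the particular worlds, the valuation, the extra proposition R, and the function names are all assumptions introduced for the example; only the idea of following K for John and then for Bill comes from the analysis above.

```python
def reachable(K, chain, start):
    """Worlds reached from start by following K once for each agent in chain."""
    frontier = {start}
    for agent in chain:
        frontier = {w2
                    for w1 in frontier
                    for w2 in K.get((agent, w1), set())}
    return frontier


def nested_knows(K, facts, chain, start, prop):
    """chain = ["John", "Bill"] reads: John knows that Bill knows that prop."""
    return all(prop in facts[w] for w in reachable(K, chain, start))


facts = {
    "W0": {"P", "R"},
    "W1": {"P", "R"},
    "W2": {"P", "R"},
    "W3": {"P"},          # R is false in W3
}
K = {
    ("John", "W0"): {"W0", "W1"},
    ("Bill", "W0"): {"W0", "W2"},
    ("Bill", "W1"): {"W1", "W3"},
}

print(nested_knows(K, facts, ["John", "Bill"], "W0", "P"))  # True
print(nested_knows(K, facts, ["John"], "W0", "R"))          # True
print(nested_knows(K, facts, ["John", "Bill"], "W0", "R"))  # False
```

Here John knows that R, but for all John knows Bill might consider W3 possible, so John does not know that Bill knows that R.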

Given  these  constraints  and  assumptions,  whenever  we  want  to  assert  or  deduce 
something  that  would  be  expressed  in  the  modal  logic  of  knowledge  by  Know(A,P),  we  can 
instead  assert  or  deduce  that  P  is  true  in  every  world  which  is  compatible  with  what  A 
knows. We can express this in ordinary first-order logic, by treating possible worlds as
individuals (in the logical sense), so that K is just an ordinary relation. We will then
introduce an operator T such that T(W,P) means that the formula P is true in the possible
world W. If we let W0 denote the actual world, then we can convert the assertion Know(A,P)
into:

∀w1(K(A,W0,w1) ⊃ T(w1,P))

It  may  seem  that  we  haven't  made  any  real  progress,  since,  although  we  have  gotten  rid 
of one nonclassical operator, Know, we have introduced another one, T. T, however, has an
important property that Know does not. Namely, T "distributes" over ordinary logical
operators. That is, ¬P is true in W just in case P is not true in W, (P ∨ Q) is true in W just in
case P is true in W or Q is true in W, etc. We might say that T is extensional, relative to a
possible world. (The strict sense of extensionality requires that only the actual world be
considered.) Thus, in contrast to Know, logical operators cannot become "trapped" inside of
T where they are inaccessible to the ordinary inference procedures. This means that we can
transform any formula so that T is applied only to atomic formulas. We can then turn T
into an ordinary first-order relation by treating all the nonintensional atomic formulas as
logical individuals. This is no loss to the expressive power of the language, since where we
would have previously asserted P, we simply assert T(W0,P) instead.
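
The transformation just described can be sketched as a small recursive translator. This is an illustration under stated assumptions, not the procedure used in the thesis: formulas are represented as nested tuples, the operator tags ("not", "and", "or", "implies", "know") and the function name translate are invented for the sketch, and fresh world variables are simply numbered w1, w2, and so on.

```python
import itertools


def translate(formula, world, fresh=None):
    """Translate a modal formula, evaluated at `world`, into a first-order
    string in which T is applied only to atomic formulas."""
    if fresh is None:
        fresh = itertools.count(1)           # supplies w1, w2, ...
    if isinstance(formula, str):             # atomic formula
        return f"T({world},{formula})"
    op = formula[0]
    if op == "not":
        return f"¬{translate(formula[1], world, fresh)}"
    if op in ("and", "or", "implies"):
        symbol = {"and": "∧", "or": "∨", "implies": "⊃"}[op]
        left = translate(formula[1], world, fresh)
        right = translate(formula[2], world, fresh)
        return f"({left} {symbol} {right})"
    if op == "know":
        _, agent, body = formula
        w = f"w{next(fresh)}"                # new world variable
        inner = translate(body, w, fresh)
        return f"∀{w}(K({agent},{world},{w}) ⊃ {inner})"
    raise ValueError(f"unknown operator: {op!r}")


print(translate(("know", "A", "P"), "W0"))
# ∀w1(K(A,W0,w1) ⊃ T(w1,P))
print(translate(("know", "John", ("know", "Bill", "P")), "W0"))
# ∀w1(K(John,W0,w1) ⊃ ∀w2(K(Bill,w1,w2) ⊃ T(w2,P)))
print(translate(("not", ("or", "P", "Q")), "W0"))
# ¬(T(W0,P) ∨ T(W0,Q))
```

Because T distributes over the connectives, the translator never needs to look inside an atomic formula; every logical operator ends up outside T, where ordinary inference procedures can reach it.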

In  this  way,  we  can  transform  a  modal  propositional  logic  with  the  nonstandard 
intensional  operator  Know  into  an  ordinary  first-order  theory  containing  the  relations  K  and 
T,  and  in  which  possible  worlds  and  the  atomic  formulas  of  the  modal  logic  are  treated  as 
individuals.  It  may  seem  that  we  have  introduced  notions  such  as  possible  worlds  and 
formulas  as  individuals  with  too  little  regard  for  whether  such  things  actually  exist,  i.e., 
without  worrying  whether  the  resulting  theory  is  actually  true.  The  answer  to  this  type  of 
objection is that from an AI point of view, it just doesn't matter. What we are seeking are
ways of creating systems that exhibit certain desired behaviors. Any notion that helps us
achieve  this  goal  is  an  acceptable  analytical  tool.  We  are  not  required  to  believe  that 
possible  worlds  "really  exist"  for  our  systems  to  work  any  more  than  the  electrical  engineer 
who  uses  complex  analysis  is  required  to  believe  that  imaginary  numbers  "really  exist"  for 
his  circuits  to  work. 

2.4  A  Note  on  Belief 

The  ideas  we  have  presented  for  formalizing  statements  about  knowledge  could  easily  be 
extended  to  handle  the  related  concept  of  belief.  We  could  give  a  modal  axiomatization  of 
belief  very  similar  to  the  one  for  knowledge,  the  main  difference  being  that  there  would  be 
no  analogue  to  M2,  the  axiom  that  states  that  anything  that  is  known  must  be  true.  In 
corresponding  fashion,  we  could  define  a  possible-world  semantics  for  this  theory.  This 
semantics  and  the  one  for  knowledge  would  differ  mainly  in  that  the  accessibility  relation 
for  belief  would  not  be  reflexive,  since  there  is  no  reason  to  expect  the  actual  world  to  be 
compatible  with  everything  that  someone  believes.  In  other  words,  we  would  want  to  allow 
for  false  beliefs. 

It might even be argued that we ought to take belief as the more fundamental notion
and  define  knowledge  in  terms  of  belief.  We  have  two  reasons  for  not  doing  this,  one 
theoretical  and  one  practical.  The  theoretical  reason  is  that  it  is  not  at  all  clear  that 
knowledge  can  be  defined  in  terms  of  belief.  The  idea  that  knowledge  is  simply  true  belief 
would  probably  not  get  us  into  trouble  in  the  examples  in  this  thesis,  but  it  is  certainly  not 
correct  in  general.  For  example,  a  compulsive  gambler  who  firmly  believes  that  the  number 
he  has  chosen  will  hit  has  no  better  claim  to  knowledge  on  those  rare  occasions  when  he 
guesses  right  than  on  the  many  occasions  when  he  is  wrong.  Knowledge,  therefore,  is  more 
than simply true belief. Exactly what else is required is one of the classical questions of
epistemology, and still has no generally accepted answer. Gettier (1963) has pointed out
counter-examples to some widely held views.

The  practical  reason  for  not  basing  our  formalism  on  the  notion  of  belief  is  that  we 
want  to  concentrate  on  issues  relating  to  actions,  and  the  effects  of  actions  on  belief  are 
much harder to state than the effects of actions on knowledge. The problem is that,
while knowledge tends to be cumulative, belief does not. If we observe or perform a
physical action, we generally know everything we knew before, plus whatever we have
learned from the action. Similarly, if someone tells us something true, we usually gain new
knowledge without having to give up any old knowledge.

It  is  true  that  some  actions,  like  shuffling  a  pack  of  cards,  can  in  a  sense  reduce  our 
knowledge.  But  this  depends  on  the  frame  of  reference.  If  we  know  the  order  of  a  deck  of 
cards at time t1, and we shuffle the cards until t2, we still know the order of the cards at t1.
What  the  shuffling  does  is  to  prevent  us  from  acquiring  some  new  knowledge,  the  order  of 
the  cards  at  t2. 

On  the  other  hand,  if  the  results  of  an  action  or  the  contents  of  a  message  contradict  a 
previous  belief,  it  is  much  harder  to  say  what  happens.  If  the  new  information  is  to  be 
accepted,  then  certainly  the  contradictory  belief  must  be  given  up.  However,  individual 
beliefs  are  part  of  complex  belief  structures  which  may  have  to  undergo  global  adjustments 
in order to remove the discarded belief. Even true beliefs may be given up in this process,
if they were based on other false beliefs (which underscores the inadequacy of "true belief"-type
theories of knowledge). Furthermore, there is the problem of whether the new
information will be accepted at all, since it contradicts what the person thinks he knows.

These  issues  are  difficult  enough  for  a  system  to  handle  in  revising  its  own  beliefs, 
where  it  is  at  least  possible  for  the  system  to  know  the  dependency  structure  of  the  beliefs 
(Doyle, 1978). To replace knowledge by belief in our system, however, would require the
system to reason about how some other agent would revise his beliefs, without necessarily
knowing the relevant dependencies. People often deal with this problem by trying to find
out what those dependencies are. For instance, in trying to persuade another person to give
up some belief of his, someone might ask, "Why do you believe that?", so that he can argue
against the basis for the belief. It is certainly an important area for research to try to devise
systems  with  these  capabilities,  but  it  would  lead  us  in  a  different  direction  from  the  one  we 
want  to  follow  in  this  thesis. 

2.5  Knowledge,  Equality,  and  Quantification 

The  formalization  of  knowledge  presented  so  far  is  purely  propositional.  A  number  of 
problems  are  encountered  when  we  attempt  to  extend  the  theory  to  handle  equality  and 
quantification.  The  first  person  to  recognize  the  special  problems  that  contexts  such  as 
knowledge  and  belief  present  for  the  logic  of  equality  was  Frege  (1892).  He  pointed  out 
that since the phrases "the morning star" and "the evening star" both refer to the planet
Venus, according to Leibniz's principle of substituting equals for equals, for any sentence
containing "the morning star", the corresponding sentence containing "the evening star"
ought  to  have  the  same  truth  value.  Yet  this  is  not  the  case.  The  sentence  "John  knows 
that  the  morning  star  is  a  body  illuminated  by  the  sun,"  may  be  true,  while  "John  knows 
that  the  evening  star  is  a  body  illuminated  by  the  sun,"  may  be  false,  if  John  does  not  know 
that  the  morning  star  and  the  evening  star  are  the  same. 

Frege's  solution  to  this  problem  depends  on  distinguishing  the  denotation  of  an 
expression  from  its  sense.  The  denotation  of  an  expression  is  the  object  in  the  world  to 
which  the  expression  refers.  In  the  case  of  "the  morning  star",  the  denotation  would  be 
Venus.  The  sense  of  an  expression  is  an  abstract  entity  "in  which  is  contained  the  manner 
and context of presentation," (Frege, 1892, p. 86). Thus "the morning star" has a different
sense from "the evening star", because the first attempts to present an object as the star seen
in the morning, while the second attempts to present an object as the star seen in the
evening. We have to qualify these statements with the word "attempts", because a phrase
can  have  a  sense,  and  still  not  refer  to  anything.  Frege  gives  the  example  of  "the  series  with 
the  least  [i.e.,  slowest]  convergence". 

Having made this distinction, Frege goes on to assert that in indirect discourse, in which
he includes knowledge and belief contexts, the denotation of a term is not its usual
denotation. Instead, the denotation of a term in the context of indirect discourse is claimed
to be the usual sense of the term. Since the usual senses of "the morning star" and "the
evening star" are different, Leibniz's law does not apply to them in the context "John
knows that ___ is a body illuminated by the sun." Therefore the invalid inference which we
were worried about cannot be made.

Frege  does  not  go  beyond  this  informal  analysis  to  provide  us  with  a  logic  of  sense  and 
denotation  which  we  would  need  in  order  to  use  these  ideas.  Formally,  what  is  required  is 
that in the logic of knowledge, Know(A,P(B)) and (B = C) should not entail Know(A,P(C)).
Frege's sense/denotation distinction seems to be adequate for this. However, we also want to
account for the fact that Know(A,P(B)) and Know(A,(B = C)) do imply (at least plausibly)
Know(A,P(C)).  Frege  gives  us  no  help  here. 

The  possible-world  analysis  of  knowledge  provides  a  very  neat  solution  to  this  problem, 
once  we  realize  that  a  term  can  denote  different  objects  in  different  possible  worlds.  For 
instance,  it  might  be  possible  that  had  the  history  of  the  solar  system  been  different  it  would 
have  been  Mercury  that  was  "the  morning  star"  rather  than  Venus,  or  that  two  different 
planets  that  don’t  even  exist  in  the  actual  solar  system  would  have  been  "the  morning  star" 
and "the evening star". Thus, we will say that an equality statement such as (B = C) is true
in a possible world W just in case the denotation of the term B in W is the same as the
denotation of the term C in W. This is a special case of the more general rule that a
formula of the form P(A1,...,An) is true in W just in case the tuple consisting of the
denotations in W of the terms A1,...,An is in the extension in W of the relation P. In
other words, we fix the interpretation of "=" in all possible worlds to be the identity relation.

Now everything will work correctly. If Know(A,P(B)) and Know(A,(B = C)) are true, then
in all worlds which are compatible with what A knows the denotation of B is in the
extension of P and is the same as the denotation of C, hence the denotation of C is in the
extension of P. But from this we can infer that Know(A,P(C)) is true. If (B = C) were true,
but not Know(A,(B = C)), then the denotation of B would be the same as the denotation of C
in  the  actual  world,  but  not  in  all  worlds  which  are  compatible  with  what  A  knows,  so  the 
inference  would  not  go  through.  Recalling  our  discussion  in  section  1.2  of  McCarthy  and 
Hayes’s  approach  to  this  problem,  we  can  see  that  we  wouldn't  need  to  have  different 
expressions  to  refer  to  someone's  idea  of  the  combination  of  a  safe  and  the  actual 
combination  as  they  propose.  We  would  simply  regard  the  denotation  of  the  expression  for 
the  combination  of  the  safe  as  depending  on  which  possible  world  it  is  evaluated  in.  The 
denotation  of  that  expression  could  well  be  different  in  the  actual  world  than  in  the  worlds 
which  are  compatible  with  what  someone  believes. 
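
World-relative denotation is easy to illustrate with a toy model. In the sketch below the objects obj1 and obj2, the two worlds, the extension of P, and the helper names are all hypothetical; the sketch only shows that substitution goes through when the equality holds in every world compatible with A's knowledge and fails when it holds merely in the actual world.

```python
# What each term denotes in each world (hypothetical toy model).
denotation = {
    "W0": {"B": "obj1", "C": "obj1"},   # (B = C) is true in the actual world W0
    "W1": {"B": "obj1", "C": "obj2"},   # (B = C) is false in W1
}
# The extension of the predicate P in each world.
extension_P = {"W0": {"obj1"}, "W1": {"obj1"}}


def holds_P(world, term):
    """P(term) is true in world iff the term's denotation there is in P's extension."""
    return denotation[world][term] in extension_P[world]


def holds_eq(world, t1, t2):
    """(t1 = t2) is true in world iff both terms denote the same object there."""
    return denotation[world][t1] == denotation[world][t2]


def knows(accessible, condition):
    """A condition is known iff it holds in every compatible world."""
    return all(condition(w) for w in accessible)


unsure = {"W0", "W1"}   # A cannot rule out W1, so A does not know (B = C)
sure = {"W0"}           # only worlds where (B = C) holds remain

print(knows(unsure, lambda w: holds_P(w, "B")))        # True
print(knows(unsure, lambda w: holds_eq(w, "B", "C")))  # False
print(knows(unsure, lambda w: holds_P(w, "C")))        # False
print(knows(sure, lambda w: holds_P(w, "C")))          # True
```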

The  introduction  of  quantifiers  also  causes  problems.  Suppose  John  is  trying  to  repair 
a  radio.  Consider  the  sentence  "John  knows  a  transistor  is  burned  out."  This  sentence  has 
at least two interpretations. The first is that John knows that some transistor is burned out,
but he does not necessarily know which one. The second interpretation is that there is a
particular transistor which John knows is burned out. Sentences such as this were first
studied by Russell (1905). He explained the ambiguity by analyzing sentences of the form
"A P is Q," as "'P(x) and Q(x)' is sometimes true." In modern notation, we would write this as
∃x(P(x) ∧ Q(x)). So a sentence of the form "A transistor is Q," would be formally represented
as ∃x(Transistor(x) ∧ Q(x)).

Russell goes on to point out that in sentences of the form "John knows a P is Q," the rule
for eliminating the phrase "a P" can be applied either to the whole sentence, or only to the
subordinate clause, "a P is Q." Applying this observation to "John knows a transistor is
burned out," gives us the following two formal representations:

(1) Know(John,∃x(Transistor(x) ∧ Burned-out(x)))

(2) ∃x(Transistor(x) ∧ Know(John,Burned-out(x)))

The  most  natural  English  paraphrases  of  these  formulas  are  "John  knows  that  there  is  a 
burned  out  transistor,"  and  "There  is  a  transistor  which  John  knows  is  burned  out."  These 
seem  to  correspond  pretty  well  to  the  two  interpretations  which  we  identified  for  the  original 
sentence. So the ambiguity in the original sentence is mapped into an uncertainty as to the
scope  of  the  operator  Know. 

There  is  another  possible  interpretation  of  "John  knows  a  transistor  is  burned  out," 
which  isn’t  accounted  for  by  Russell's  theory  of  how  English  sentences  of  the  form  "A  P  is 
Q," are expanded, but which can be represented in this notation, namely:

(3) ∃x(Know(John,(Transistor(x) ∧ Burned-out(x)))).

(3)  can  be  read  as  "There  is  something  that  John  knows  to  be  burned  out  and  (knows)  to  be 
a  transistor."  The  difference  between  (2)  and  (3)  is  that  (3)  asserts  that  John  knows  that  the 
thing which he knows is burned out is a transistor, while (2) simply asserts that it is a
transistor without asserting whether John knows that it is. Thus (2) is weaker than (3) in
that it would be true if John knew that a particular transistor was burned out without
knowing that it was a transistor. He might know nothing about electronic components but
see smoke rising from a certain object and say to himself, "That thing (whatever it is) is
burned out." In this case, it would be correct to say that he knows of a particular transistor
that it is burned out, and he knows that something is burned out, but he does not know
that it is a transistor that is burned out.

Following  a  suggestion  of  Hintikka  (1962),  we  can  use  a  formula  similar  to  (2)  or  (3)  to 
express  the  fact  that  someone  knows  who  or  what  something  is.  He  points  out  that  a 
sentence of the form "A knows who (or what) B is," intuitively seems to be equivalent to
"there is someone (or something) that A knows to be B." But this can be represented
formally as ∃x(Know(A,(B = x))). To take a specific example, "John knows who the President
is," can be paraphrased as "There is someone whom John knows to be the President," which
can be represented by:

(4) ∃x(Know(John,(President = x)))

In (1), Know may still be regarded as a purely propositional operator, although the
proposition to which it is applied now has a quantifier in it. Put another way, Know still is
used  simply  as  a  relation  between  a  knower  and  the  proposition  he  knows.  (2)  and  (4)  are 
not  so  simple.  In  these  formulas  there  is  a  quantified  variable  that  is  bound  outside  the 
scope  of  the  Know  operator,  but  has  an  occurrence  inside.  This  situation  is  usually  called 
"quantifying  in",  and  it  creates  problems  for  the  formal  interpretation  of  Know  as  a  relation 
between  a  knower  and  a  proposition. 

Consider trying to apply the usual Tarskian notion of satisfiability (Rogers, 1971) to (2).
This is of the form ∃x(P), so we must bind x to an individual that makes P true. In this
case P is a conjunction, so the value of x must satisfy both conjuncts of P. The first
conjunct is Transistor(x), which we chose to represent the fact that the value of x is a
transistor, a physical object. For the second conjunct, Know(John,Burned-out(x)), to be true,
Burned-out(x) must denote a proposition. The value of x has to be a physical object to
satisfy the first conjunct, but Frege's argument shows that Burned-out(x) does not determine
a proposition unless the value of x tells us how the object is identified. The analysis we
have does not supply us with any such description, so we are stuck; the Tarskian definition
of satisfiability doesn't work here.

The  possible-world  analysis,  however,  provides  us  with  a  very  natural  interpretation  of 
quantifying in. We keep the standard interpretation that ∃x(P) is true just in case there is
some value for x that satisfies P. If P is Know(A,Q) then a value for x satisfies P just in case
that value satisfies Q in every world that is compatible with what A knows. So (2) is
satisfied if there is a particular transistor which is burned out in every world that is
compatible with what John knows. That is, in every such world, the same transistor is
burned out. On the other hand, (1) is satisfied if in every world compatible with what
John  knows  there  is  some  burned  out  transistor,  but  it  doesn't  have  to  be  the  same  one  in 
every  case.  In  either  situation,  there  is  no  problem  about  determining  a  proposition  from  a 
physical  object,  because  we  do  not  speak  of  propositions.  We  simply  talk  about  various 
possible  worlds  and  which  transistors  are  burned  out  in  those  worlds. 
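
The two readings can be compared on a small model. The sketch below is illustrative only: the transistors t1 and t2, the worlds, the dictionary layout, and the function names are assumptions. Reading (1) places the existential inside the quantification over worlds; reading (2) places it outside, with Transistor(x) evaluated in the actual world.

```python
def reading_1(model, accessible):
    """(1): in every compatible world, some transistor of that world is burned out."""
    return all(
        any(x in model["burned_out"][w] for x in model["transistor"][w])
        for w in accessible
    )


def reading_2(model, actual, accessible):
    """(2): some transistor of the actual world is burned out in every compatible world."""
    return any(
        all(x in model["burned_out"][w] for w in accessible)
        for x in model["transistor"][actual]
    )


accessible = {"W0", "W1"}   # worlds compatible with what John knows in W0

# A different transistor is burned out in each compatible world.
model_a = {
    "transistor": {"W0": {"t1", "t2"}, "W1": {"t1", "t2"}},
    "burned_out": {"W0": {"t1"}, "W1": {"t2"}},
}
# The same transistor, t1, is burned out in every compatible world.
model_b = {
    "transistor": {"W0": {"t1", "t2"}, "W1": {"t1", "t2"}},
    "burned_out": {"W0": {"t1"}, "W1": {"t1", "t2"}},
}

print(reading_1(model_a, accessible), reading_2(model_a, "W0", accessible))
# True False
print(reading_1(model_b, accessible), reading_2(model_b, "W0", accessible))
# True True
```

Nothing in the sketch refers to propositions; it speaks only of worlds and of which objects are burned out in them.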

This  analysis  does  require  us  to  talk  about  the  same  individual  existing  in  several 
different  possible  worlds,  which  may  seem  unintuitive,  but  as  Kripke  (1972)  has  pointed  out 
this  is  a  common  feature  of  ordinary  discourse.  When  we  say  that  Humphrey  would  have 
won  the  1968  Presidential  election  if  he  had  only  done  such-and-such,  we  are  really 
asserting  that  there  is  some  other  course  of  events  (i.e.,  another  possible  world)  in  which 
Humphrey  would  have  done  such-and-such  and  therefore  have  won  the  election. 
Furthermore,  we  really  do  mean  Humphrey,  the  very  same  individual  who  in  fact  lost  the 
election,  when  we  talk  about  this  other  possible  world.  Of  course,  there  are  other  possible 
worlds  in  which  Humphrey  does  not  exist.  The  best  way  to  think  about  this  is  in  terms  of 
one  universal  domain  of  possible  individuals  with  the  domains  of  particular  possible  worlds 
being subsets of that domain.

Notice that the difference between (1) and (2) has been transformed from a difference in
the relative scopes of an existential quantifier and the operator Know to a difference in
relative scopes of an existential and a universal quantifier (the "every" in "every possible
world compatible with..."). Recall from ordinary first-order logic that ∃x(∀y(P(x,y))) entails
∀y(∃x(P(x,y))) but not vice versa. The possible-world analysis, then, implies that we should
be  able  to  infer  "John  knows  that  something  is  burned  out,"  from  "There  is  a  transistor 
that  John  knows  is  burned  out,"  as  indeed  we  can. 

When  we  look  at  how  this  analysis  applies  to  our  representation  for  "knowing  who"  we 
get  a  particularly  nice  picture.  We  said  that  A  knows  who  B  is  means  that  there  is  someone 
whom  A  knows  to  be  B.  If  we  analyze  this  we  conclude  that  there  is  a  particular  individual 
who  is  B  in  every  world  that  is  compatible  with  what  A  knows.  Suppose  this  were  not  the 
case,  and  in  some  of  the  worlds  compatible  with  what  A  knows  one  person  is  B  and  in  the 
others, some other person is B. In other words, for all that A knows, either of these two
people might be B. But this is exactly what we mean when we say A doesn't know who B is!
Basically,  the  possible-world  view  gives  us  the  very  natural  picture  that  A  knows  who  B  is  if 
A  has  the  possibilities  for  B  narrowed  down  to  a  single  individual. 

There  is  at  least  one  more  consequence  of  this  analysis  that  is  worth  noting.  Suppose 
that  A  knows  who  B  is  and  who  C  is.  Then  the  denotation  of  B  is  the  same  in  all  the 
worlds  which  are  compatible  with  what  A  knows,  and  the  same  is  true  for  C.  Since  in  all 
these  worlds,  B  and  C  each  have  only  one  denotation,  they  either  have  the  same  denotation 
everywhere or different denotations everywhere. Thus, either (B = C) is true in every world
compatible with what A knows or (B ≠ C) is. From this we can infer that either A knows
that  B  and  C  are  the  same  individual  or  that  they  are  not.  The  end  conclusion  is  that  if  A 
knows  who  both  B  and  C  are,  he  must  know  whether  they  are  the  same  person,  another 
very  intuitive  result. 
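
Both observations about "knowing who" can be checked mechanically. The following sketch is a hypothetical illustration: the individuals alice and bob, the worlds, and the function names are invented, and the set of compatible worlds is assumed to be nonempty.

```python
def knows_who(denotation, accessible, term):
    """∃x(Know(A,(term = x))): term denotes one individual across all compatible worlds."""
    return len({denotation[w][term] for w in accessible}) == 1


def knows_whether_same(denotation, accessible, t1, t2):
    """Either (t1 = t2) holds in every compatible world, or (t1 ≠ t2) does."""
    verdicts = {denotation[w][t1] == denotation[w][t2] for w in accessible}
    return len(verdicts) == 1


denotation = {
    "W1": {"B": "alice", "C": "alice"},
    "W2": {"B": "alice", "C": "bob"},
}

# A cannot rule out W2: the possibilities for C are not narrowed to one individual.
undecided = {"W1", "W2"}
print(knows_who(denotation, undecided, "B"))                # True
print(knows_who(denotation, undecided, "C"))                # False
print(knows_whether_same(denotation, undecided, "B", "C"))  # False

# Once W2 is ruled out, A knows who both are, and hence whether they are the same.
decided = {"W1"}
print(knows_who(denotation, decided, "C"))                  # True
print(knows_whether_same(denotation, decided, "B", "C"))    # True
```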


We  now  have  a  coherent  account  of  quantifying  in  that  does  not  talk  about  knowing 
particular  propositions.  Still,  in  many  cases  there  will  be  a  certain  proposition  such  that 
knowing  that  proposition  counts  as  knowing  something  which  we  would  express  by 
quantifying  in.  For  instance,  the  proposition  that  John  knows  that  Bill's  telephone  number 
is  321-1234  might  be  represented  as: 

(5) Know(John,(Phone-num(Bill) = 321-1234)),

which does not involve quantifying in. We want to be able to infer from this, however,
that John knows what Bill's telephone number is, which would be represented as:

(6) ∃x(Know(John,(Phone-num(Bill) = x))).

It might seem that (6) can be derived from (5) simply by the logical principle of
existential generalization (EG), but the situation is more complicated than that. Suppose
that (5) were not true, but instead, John simply knew that Bill and Mary had the same
telephone number. We could represent this as:

(7) Know(John,(Phone-num(Bill) = Phone-num(Mary))).

It is clear that we would not want to infer from (7) that John knows Bill's telephone
number, yet if we can get (6) from (5) by EG, then we ought to be able to get (6) from (7) by
the same process.

To  take  another  example,  suppose  there  is  a  collection  of  blocks  that  John  knows 
something  about.  If  John  knows  that  the  number  of  cubes  is  greater  than  ten,  then  there  is 
a  number  such  that  John  knows  that  the  number  of  cubes  is  greater  than  that  number.  If, 
on  the  other  hand,  all  that  John  knows  is  that  the  number  of  cubes  is  greater  than  the 
number  of  pyramids,  then  there  may  not  be  any  number  such  that  John  knows  the  number 
of  cubes  is  greater  than  that  number. 


It seems then that EG can be applied to occurrences in knowledge contexts of the terms
which represent "321-1234" and "ten" but not the terms which represent "Mary's telephone
number" and "the number of pyramids". What is the difference in these cases? The
difference seems to be that "321-1234" and "ten" are standard names for the things they
refer to, whereas "Mary's telephone number" and "the number of pyramids" are not. A
standard  name  can  be  thought  of  as  a  name  such  that  knowing  what  the  name  denotes  is 
part  of  knowing  the  language  that  the  name  occurs  in.  Thus,  not  to  know  which  number  is 
the  number  of  cubes  in  a  certain  collection  is  to  be  ignorant  of  a  certain  feature  of  the 
world.  Not  to  know  which  number  is  ten  is  to  be  ignorant  of  part  of  English,  namely  the 
meaning  of  the  word  "ten". 

Now we can show formally why EG works for standard names in knowledge contexts
even though it doesn't work in general. Suppose John knows that P(B) is true, where B is a
standard  name. 

(8)  Know(John,P(B)) 

Since  B  is  a  standard  name,  it  is  part  of  knowing  the  language  to  know  what  B  is;  so  we  will 
assume  that  everyone,  including  John,  knows  what  B  is: 

(9)  ∃x(Know(John,(B = x)))

By  ordinary  first-order  logic,  we  can  conjoin  (8)  and  (9),  bringing  (8)  inside  the  scope  of  the 
quantifier  in  (9).  This  is  valid  because  (8)  does  not  contain  any  free  occurrences  of  the 
variable  in  (9): 

(10)  ∃x(Know(John,(B = x)) ∧ Know(John,P(B)))

Now,  using  the  results  on  equality  substitution  that  we  developed  in  the  first  part  of  this 
section, we can substitute x for B in P(B):

(11)  ∃x(Know(John,P(x)))


It  should  be  noted  that  we  are  not  claiming  that  the  only  way  of  knowing  who  or 
knowing what is to know a proposition that contains a standard name. For instance, if John
picks  up  an  unusual  rock  and  puts  it  in  his  pocket,  we  would  not  want  to  claim  that  John 
doesn’t  know  what  is  in  his  pocket  just  because  there  is  no  standard  name  in  the  language 
for  that  particular  rock.  To  say  exactly  what  the  other  ways  of  knowing  who  or  knowing 
what are is one of the basic problems of epistemology, and is, therefore, beyond the scope of
this thesis. As with the concept of knowledge itself, we have a formalism that makes some
intuitively plausible predictions, and any epistemological theory that explains these
predictions would be acceptable.

In  terms  of  possible  worlds,  standard  names  have  a  very  straightforward  interpretation. 
Standard  names  are  simply  terms  that  have  the  same  denotation  in  every  possible  world.  If 
the  denotation  of  a  standard  name  is  fixed  by  the  language  alone,  then  no  matter  what 
possible  world  we  are  talking  about,  the  name  must  have  that  denotation.  Following  Kripke 
(1972), we will call terms that have the same denotation in every possible world rigid
designators.  The  conclusion  that  standard  names  are  rigid  designators  seems  inescapable. 
How  could  any  expression  be  a  standard  name,  a  canonical  identifier,  of  an  individual  if 
under  some  circumstances  that  expression  refers  to  something  else? 

The validity of EG for standard names follows immediately from this definition. The
possible world analysis of Know(John,P(B)) is that in every world which is compatible with
what John knows, the denotation of B in that world is in the extension of P in that world.
In general, EG fails because we are unable to conclude that there is any particular individual
which is in the extension of P in all the relevant worlds. If B is a rigid designator, however, the
denotation of B is the same in every world, so it is the same in every world compatible with
what John knows, and that denotation is an individual which is in the extension of P in all
those worlds.
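
The contrast between rigid and nonrigid terms can be illustrated with a small model of the telephone-number example. This is a sketch under stated assumptions: the second number "555-0000", the helper world, and the function names are hypothetical, and P(x) is read as "Bill's telephone number is x".

```python
# Hypothetical domain of telephone numbers; "555-0000" is invented for illustration.
DOMAIN = {"321-1234", "555-0000"}

def world(bill, mary):
    """Build a world from Bill's and Mary's numbers in that world."""
    return {
        # "321-1234" is rigid: it denotes the same number in every world.
        "denote": {"321-1234": "321-1234", "Phone-num(Mary)": mary},
        # Extension of P in this world, where P(x) reads "Bill's number is x".
        "P": {bill},
    }

def knows_P_of(term, worlds):
    """Know(John,P(term)): term's denotation is in P's extension in every world."""
    return all(w["denote"][term] in w["P"] for w in worlds)

def exists_x_known_P(worlds, domain):
    """∃x(Know(John,P(x))): one individual is in P's extension in every world."""
    return any(all(x in w["P"] for w in worlds) for x in domain)

# Worlds compatible with (5): Bill's number is fixed, Mary's is unknown to John.
worlds_5 = [world("321-1234", "321-1234"), world("321-1234", "555-0000")]
# Worlds compatible with (7): the two numbers agree, but John does not know the number.
worlds_7 = [world("321-1234", "321-1234"), world("555-0000", "555-0000")]
```

With worlds_5, both knows_P_of("321-1234", worlds_5) and exists_x_known_P(worlds_5, DOMAIN) hold. With worlds_7, knows_P_of("Phone-num(Mary)", worlds_7) holds while exists_x_known_P(worlds_7, DOMAIN) does not, which is the failure of EG for a nonrigid term.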

In addition to rigid designators which are simple constants, we will also need to have
terms  built  up  using  rigid  functions.  Rigid  functions  will  have  the  property  that  if  the 
arguments  to  the  function  are  rigid  designators,  then  the  term  consisting  of  the  function 
applied  to  the  arguments  is  also  a  rigid  designator.  We  will  make  considerable  use  of  this 
notion  in  section  3.2  when  we  formalize  the  notion  of  an  action  having  knowledge 
preconditions. If Act(x1,...,xn) represents a general action that we assume anyone can
perform,  then  we  will  treat  Act  as  a  rigid  function.  In  our  possible-world  semantics  for 
actions,  this  will  have  the  consequence  that  an  agent  knows  how  to  perform  some  specific 
instance  of  Act,  just  in  case  he  knows  what  individuals  the  arguments  of  Act  refer  to. 

Saying  that  a  standard  name  is  a  term  whose  denotation  is  determined  by  the  language 
it  occurs  in  leaves  open  the  question,  "What  language?"  It  would  be  possible  to  have  two 
languages  that  were  identical  in  syntax  and  semantics,  except  that  some  of  the  terms  which 
were  standard  names  in  one  language  were  not  standard  names  in  the  other.  In  fact,  this 
happens  all  the  time.  The  linguistic  community  comprising  the  users  of  any  natural 
language  will  contain  subcommunities  who  use  certain  terms  as  standard  names  that  are  not 
shared  by  the  larger  community.  This  is  done  more  or  less  formally  in  professional  and 
scientific  disciplines,  and  informally  in  other  contexts. 

This is an important point for AI systems, because they usually assume that there is a
common vocabulary shared by the system and the user that goes beyond the bare essentials
of the basic language. For example, in systems for analysis of electronic circuits [e.g.,
(Stallman and Sussman, 1976), (Brown, 1977)], individual components and nodes are
typically assigned identifiers that function as standard names. If Q301 is such an identifier
and the transistor it refers to is burned out, then the system knows which transistor is
burned out only if it knows that Q301 is burned out. We will make use of this notion of
specialized vocabularies of standard names in our examples. Standard names for objects
mentioned by an example will be freely introduced whenever the identity of that object is
not  directly  relevant  to  the  point  which  we  are  using  the  example  to  illustrate.  It  should  be 
noted,  however,  that  this  is  never  essential  for  making  the  examples  work.  We  could  always 
eliminate  the  assumption  that  the  identifiers  are  standard  names  by  adding  explicit 
assertions  that  the  agent  in  the  example  knows  what  objects  the  identifiers  refer  to. 

There  are  a  few  more  observations  to  be  made  about  standard  names  and  rigid 
designators.  First,  in  describing  standard  names  we  assumed  that  everyone  knew  what  they 
referred  to.  Identifying  them  with  rigid  designators  makes  the  stronger  claim  that  what 
they  refer  to  is  common  knowledge.  That  is,  not  only  does  everyone  know  what  a  particular 
standard name denotes, but everyone knows that everyone knows, etc. Second, although it
is  natural  to  think  of  any  individual  having  a  unique  standard  name,  this  is  not  required 
by  our  theory.  What  the  theory  does  require  is  that  if  there  are  two  standard  names  for  the 
same  individual,  it  will  be  common  knowledge  that  they  name  the  same  individual. 

Finally,  there  is  a  question  about  how  a  rigid  designator  can  refer  to  the  same 
individual  in  all  possible  worlds,  when  that  individual  may  not  exist  in  some  of  those 
worlds.  At  first  glance,  the  idea  that  the  denotation  of  a  term  in  a  possible  world  could  be 
something  that  does  not  exist  in  that  world  seems  paradoxical,  but  an  examination  of 
ordinary  discourse  shows  that  we  very  often  say  things  which  are  most  naturally  analyzed  in 
this  way.  We  can  talk  about  things  like  "the  largest  tower  that  could  be  built  out  of  these 
six  blocks,"  and  if  there  is  only  one  way  of  arranging  the  blocks  to  fit  this  description,  it  will 
denote  a  well-defined  possible,  though  nonexistent,  individual.  In  fact,  we  frequently 
quantify  over  possible  individuals  as  well,  as  when  we  ask  how  many  possible  towers  could 
be  constructed  with  a  certain  group  of  blocks,  or  whether  any  of  them  would  be  over  ten 
inches  high. 

If we let the quantifiers in our formalism range over possible individuals, then we can
allow rigid designators for possible individuals that do not actually exist and still allow
existential generalization over rigid designators, without "precipitating an ontological crisis".
If we want to treat Santa-Claus as a rigid designator, then we can infer:

(12)  ∃x(Believes(John,Lives-at(x, North-Pole)))

from:

(13)  Believes(John,Lives-at(Santa-Claus, North-Pole)) 

without  claiming  that  Santa  Claus  actually  exists.  We  are  merely  claiming  that  Santa  Claus 
might  have  existed. 

In  such  a  system,  since  existential  quantifiers  indicate  possible  existence,  to  talk  about 
actual  existence  we  would  need  a  predicate  whose  extension  in  each  possible  world  is  the 
subset  of  all  the  possible  individuals  who  actually  exist  in  that  world.  Most  universally 
quantified  statements,  however,  would  not  need  modification  so  long  as  ordinary  predicates 
are  restricted  to  actual  individuals.  Thus,  "All  men  are  mortal,"  could  still  be  true  even  if 
there  are  possible  men  who  are  immortal,  so  long  as  the  predicate  "men"  picks  out  just  the 
men  who  actually  exist  in  the  possible  world  that  the  sentence  is  evaluated  in. 

2.6  Other  Work  on  Reasoning  about  Knowledge 

Although the work cited in section 1.1 seems to be the only previous work in AI to
consider  the  interaction  of  knowledge  and  action,  there  has  been  slightly  more  done  on 
reasoning  about  knowledge  alone.  Most  of  this  work  is  due  to  John  McCarthy  and  his 
students,  much  of  it  unpublished.  As  we  mentioned  in  section  1.2,  in  addition  to  their 
unsatisfactory  attempt  to  formalize  knowledge  prerequisites  for  actions,  McCarthy  and 
Hayes  (1969)  review  Hintikka's  work  on  the  logic  of  knowledge  and  make  the  point  that  the 
possible-world semantics for this logic could be formalized in first-order logic. This seems
to be the first time that this idea appears in the AI literature, but they do not pursue the
approach.  They  do  make  a  tantalizing  reference  to  another  idea  that  is  crucial  to  our 
efforts  to  integrate  knowledge  and  action,  namely  identifying  possible  worlds  in  the 
semantics  for  knowledge  with  situations  in  their  formalism  for  actions.  They  give  no 
examples  that  make  use  of  this  identification,  however,  so  it  is  not  clear  that  they  have  in 
mind the same interpretation that we will use. Subsequently McCarthy (1978) recalled that
they abandoned this idea, because they could not see how to express that someone knows
the effect of an action if possible worlds and situations are identified. As we will see in
chapter  3,  our  approach  does  not  suffer  from  this  problem. 

Sato (1976) uses some techniques due to Gentzen to prove some very general results
about  the  soundness,  completeness,  and  decidability  of  various  propositional  modal  logics 
with  respect  to  Kripke-style  possible-world  semantics.  He  then  applies  these  results  to  some 
puzzles  in  the  logic  of  knowledge,  using  an  axiomatization  due  to  McCarthy.  Since  we  will 
be  using  the  model  theory  directly  in  our  system,  his  results  on  the  properties  of  the  modal 
logics  themselves  do  not  seem  directly  relevant  to  our  work.  Sato’s  work  is  interesting, 
however,  in  that  he  does  integrate  a  simple  logic  of  time  into  his  logic  of  knowledge. 
Because  he  does  not  identify  possible  worlds  with  possible  situations,  though,  all  formulas  in 
his  system  are  required  to  be  definite  as  to  time.  In  chapter  3  we  will  explain  why  this  is 
the  case  and  what  problems  it  creates. 

In  an  unpublished  note,  McCarthy  (1975)  uses  the  possible-world  semantics  developed 
by  Sato  to  construct  a  first-order  axiomatization  of  knowledge  in  exactly  the  same  way  that 
we  will  use  Hintikka's.  He  then  uses  this  axiomatization  to  give  a  formal  proof  of  the 
solution  to  a  puzzle  involving  reasoning  about  knowledge.  Goad  (1976)  uses  the  same  ideas 
to  formalize  some  interesting  problems  in  reasoning  about  lack  of  knowledge  which  we  will 
discuss in chapter 8. As we have mentioned, the main limitation of this work is that it
deals  only  with  the  propositional  part  of  the  logic  of  knowledge,  while  handling  quantifiers 
and  equality  will  be  essential  for  the  problems  we  wish  to  solve. 

More  recently  McCarthy  (1979)  (summarized  in  McCarthy  (1977b))  uses  a  completely 
different  approach  to  handle  the  problem  of  referential  transparency  in  knowledge  contexts. 
He  proposes  that  the  notion  of  knowing  a  phone  number,  for  instance,  be  regarded  as  a 
relation  between  the  knower  and  the  concept  of  the  phone  number,  rather  than  as  a  relation 
between  the  knower  and  the  phone  number  itself.  This  allows  McCarthy  to  account  for  the 
fact  that  although  Mary  and  Bill  may  have  the  same  phone  number,  it  does  not  follow  from 
the  fact  that  John  knows  Mary's  phone  number  that  John  knows  Bill's  phone  number. 
The point is that while the phone numbers are the same, the concept of Mary's phone
number  is  distinct  from  the  concept  of  Bill's  phone  number.  Thus  knowing  one  of  them 
does  not  imply  knowing  the  other. 

This is just the problem described by Frege (1892) which we discussed in section 2.5,
and  McCarthy's  concept/object  distinction  seems  to  be  essentially  the  same  as  the 
sense/denotation  distinction  which  Frege  proposed  as  a  solution  to  the  problem.  McCarthy's 
proposal  is  interesting,  though,  because  he  formalizes  these  ideas  entirely  within  first-order 
logic  by  treating  concepts  as  individuals.  McCarthy  gives  many  examples  of  representations 
of  various  types  of  statements,  but  since  he  axiomatizes  few  of  the  properties  of  knowing 
and  presents  few  deductions  in  the  system,  it  is  difficult  to  evaluate  these  methods. 

None  of  the  work  cited  above  deals  with  the  problem  of  automatically  generating 
deductions  about  knowledge.  The  only  previous  work  which  comes  close  to  this  problem  is 
by  Morgan  (1976).  Morgan  deals  with  purely  abstract  modal  logics,  rather  than  the  logic  of 
knowledge  specifically,  but  the  general  issues  are  the  same  in  either  case.  He  presents  two 
methods for using standard theorem proving techniques to prove theorems in modal logic.
One of these is to axiomatize the possible-world semantics of the logic in exactly the same
way as is done here and in McCarthy's formalism. The other method, which he calls the
syntactic method, is to make sentences of the modal logic into terms in a first-order logic,
and introduce the predicate PR(P) which means that P is provable in the modal logic. The
only axioms necessary are one axiom of the form PR(P) for each axiom or axiom schema P
in the modal logic, and one axiom for each rule of inference in the modal logic. For
instance, (PR(implies(p1,p2)) ⊃ (PR(p1) ⊃ PR(p2))) would represent modus ponens. Morgan
then feeds these axioms to a simple resolution theorem prover, and for any formula P which
he wishes to prove in the modal logic, he tries to prove PR(P) in the resolution system.

The  trouble  with  this  approach  is  that  it  suffers  the  same  problems  of  efficiency  that  we 
saw in section 2.2 when we explored the consequences of adding modal logic axioms to a
standard  deduction  system.  Morgan's  idea  runs  into  the  same  difficulty  that  most  of  the 
control  of  the  deductive  process  actually  resides  in  the  axioms  and  rules  of  the  modal  logic 
rather  than  in  the  theorem  prover,  and,  as  we  pointed  out  before,  these  formalisms  are 
usually  designed  to  be  concisely  stated,  rather  than  to  be  efficient  in  generating  proofs.  To 
see  what  can  happen  in  this  type  of  situation,  consider  using  the  axiom  which  describes 
modus  ponens  to  try  to  prove  an  arbitrary  goal  P.  Resolving  this  goal  against  that  axiom 
would produce a new goal of trying to find some other formula Q, such that Q and (Q ⊃ P)
can  be  proved.  Whichever  of  these  goals  we  attack  first,  the  modus  ponens  axiom  applies 
again,  producing  still  more  complicated  goals.  It  is  possible  to  continue  in  this  way  to  an 
arbitrary  depth  without  ever  touching  ground,  so  to  speak. 
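
The runaway behavior described above can be made concrete with a few lines of code. This is only a sketch, not Morgan's prover: the goal representation (strings and nested tuples), the function names, and the strategy of always attacking the first goal are assumptions made for illustration.

```python
from itertools import count

def expand_first(goals, fresh):
    """Resolve the first goal PR(g) against the modus ponens axiom.

    The goal g is replaced by two subgoals, PR(implies(Q,g)) and PR(Q),
    where Q is a fresh, as yet unconstrained formula variable.
    """
    goal, rest = goals[0], goals[1:]
    q = f"Q{next(fresh)}"
    return [("implies", q, goal), q] + rest

def unfold(goal, depth):
    """Apply the modus ponens axiom `depth` times, always to the first goal."""
    fresh = count(1)
    goals = [goal]
    for _ in range(depth):
        goals = expand_first(goals, fresh)
    return goals
```

Here unfold("P", 2) yields the goals implies(Q2,implies(Q1,P)), Q2, and Q1. Each further step adds a goal and nests the first one more deeply, and nothing in the axiom itself stops the process.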

Morgan  does  not  seem  to  be  aware  of  the  possibilities  for  this  sort  of  behavior,  although 
he  does  note  somewhat  innocently  that  he  was  able  to  prove  certain  theorems  using  the 
possible-world  approach  that  he  had  not  been  able  to  prove  using  the  syntactic  approach. 
What  is  really  disappointing  about  Morgan's  work,  however,  is  that  he  gives  no  suggestions 
for efficient use of either approach other than to simply turn loose a uniform resolution
theorem  prover  on  them.  The  main  point  of  chapters  6  and  7  of  this  thesis  will  be  to  try  to 
improve  on  that  idea. 

3.  An  Integrated  Theory  of  Knowledge  and  Action

3.1  Possible-World  Semantics  for  Actions

In the preceding sections, we presented a theory for talking about knowledge in terms of
possible worlds. If we are to capture the interactions between knowledge and action, we
need a theory of actions in these same terms. Happily, the standard AI way of looking at
actions gives us exactly that. Most AI programs that reason about actions view the world as
a set of possible situations, where each action determines a binary relation on situations,
one situation being the outcome of performing the action in the other situation. We will
integrate  knowledge  and  action  by  identifying  the  possible  worlds  that  are  used  to  describe 
knowledge  with  the  possible  situations  that  are  used  to  describe  actions. 

The  identification  of  possible  worlds  with  situations  is  somewhat  nonstandard  in 
possible-world  semantics.  Usually  a  possible  world  is  thought  of  as  including  an  entire 
course  of  events.  For  example,  we  might  say  that  in  some  possible  worlds  the  European 
discovery  of  America  occurs  100  years  later  than  it  actually  did.  It  might  seem  that  taking 
possible  worlds  to  be  situations,  and  therefore  not  extended  in  time,  might  make  it  difficult 
to  talk  about  what  someone  knows  about  the  past  or  future.  That  is  not  the  case,  however. 
Knowledge  about  the  past  and  future  can  be  handled  by  modal  tense  operators  which  have 
corresponding  accessibility  relations  on  possible  situation/worlds.  We  could  have  a  tense 
operator  Future,  such  that  Future(P)  means  that  P  will  be  true  at  some  time  to  come.  If  we 
let F be an accessibility relation such that F(W1,W2) means that the situation/world W2 lies
in the future of the situation/world W1, then we can define Future(P) to be true in W1 just
in case there is some W2 such that F(W1,W2) holds and P is true in W2.

This much is standard tense logic, as in Rescher and Urquhart (1971). The interesting
point is that statements about someone's knowledge of the future work out exactly right,
even  though  knowledge  is  analyzed  in  terms  of  alternatives  to  a  situation,  rather  than 
alternatives  to  a  course  of  events.  The  proposition  that  John  knows  that  P  will  be  true  is 
represented simply by Know(John,Future(P)). The analysis of this is that Future(P) is true in
every situation which is compatible with what John knows, from which it follows that, for
each situation which is compatible with what John knows, P is true in some future
alternative to that situation. An important point to note here is that two situations can be
"internally" similar (that is, they agree in truth value for all nonmodal statements), but be
distinct  because  they  differ  in  their  accessibility  relations  to  other  possible  situations.  So 
although  we  treat  a  possible  world  as  a  situation  rather  than  a  course  of  events,  it  is  a 
situation  in  the  particular  course  of  events  defined  by  its  relationships  to  other  situations. 
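
A small model makes the analysis of Know(John,Future(P)) concrete. The sketch below rests on assumptions: the situation names s0, s1, s2, f0, f1, f2 and the particular relations F and K are invented for illustration, and formulas are represented as Python functions from situations to truth values.

```python
def future(p, w, F):
    """Future(P) is true in w just in case P is true in some w2 with F(w,w2)."""
    return any(p(w2) for (w1, w2) in F if w1 == w)

def know(agent, p, w, K):
    """Know(agent,P) is true in w just in case P is true in every w1 with K(agent,w,w1)."""
    return all(p(w1) for (a, w0, w1) in K if a == agent and w0 == w)

# Hypothetical model: P holds only in the situations f0 and f1.
P_TRUE_IN = {"f0", "f1"}

def P(w):
    return w in P_TRUE_IN

# F relates each present situation to one situation in its future.
F = {("s0", "f0"), ("s1", "f1"), ("s2", "f2")}
# Situations compatible with what John knows in s0 (s0 itself included).
K = {("John", "s0", "s0"), ("John", "s0", "s1")}

def future_P(w):
    return future(P, w, F)

john_knows = know("John", future_P, "s0", K)

# If s2 is also compatible with what John knows, the knowledge claim fails,
# because P is false in the only future of s2.
K_wider = K | {("John", "s0", "s2")}
john_knows_wider = know("John", future_P, "s0", K_wider)
```

In this model john_knows is true and john_knows_wider is false. The situations s1 and s2 could agree on every nonmodal statement and still be distinct, since they differ only in which situations lie in their futures.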

It  turns  out  that  treating  possible  worlds  as  situations  is  actually  a  more  flexible  way  of 
handling  time  than  treating  possible  worlds  as  courses  of  events.  If,  following  Sato  (1976), 
we  formalize  possible  worlds  as  extending  over  time  and  identify  a  person’s  knowledge  with 
the  set  of  propositions  which  are  true  in  every  possible  world  which  is  compatible  with 
what  he  knows,  then  all  propositions  will  have  to  be  specific  as  to  what  times  they  refer  to. 
Thus,  "A  is  on  B,"  would  not  be  a  proposition  because  it  does  not  have  a  unique  truth  value 
in  each  possible  world.  In  some  worlds  it  will  be  true  at  some  points  in  time  and  false  in 
others. So, only sentences like "A is on B at time T," would express definite propositions. As
a  result,  we  could  not  say  that  at  time  T  (or  in  situation  W)  John  knows  that  A  is  on  B. 
Instead  we  would  have  to  say  that  at  time  T  John  knows  that  at  time  T  A  is  on  B. 
Furthermore,  since  a  person  can  know  something  at  one  time  that  he  did  not  know  at  an 
earlier  time,  the  accessibility  relation  for  knowledge,  K,  will  have  to  have  an  extra  argument 
position  for  the  time  in  question. 

For a planning system this distinction is extremely important. Suppose the system has
the goal of bringing about P, and it knows that if Q is true, performing Act will cause P to
be  true.  If  its  knowledge  about  the  truth  of  Q  is  limited  to  statements  of  the  form  Q  is  true 
at  time  T,  there  is  a  problem,  because  the  system  has  no  way  to  represent  what  time  T  is  with 
respect  to  the  time  at  which  the  system  is  doing  its  planning.  What  it  needs  to  know  is  that 
Q is true now, i.e., Q is simply true, but this is not representable in the logic. The logic
cannot have a formula such as T = Now either, since this is not a statement that can be true
at all times in a given possible world. Of course, the effect of having such a fact can be
built into the planner, but it is necessary to go outside of the logic to do it.

For reasoning about actions, instead of a tense operator like Future, which simply says
what  will  be  true,  we  need  an  operator  that  talks  about  what  would  be  true  if  a  certain 
action  were  performed.  Our  approach  will  be  to  recast  McCarthy's  situation  calculus 
(McCarthy,  1963),  (McCarthy  and  Hayes,  1969),  to  mesh  with  our  possible-world  approach 
to  reasoning  about  knowledge.  The  situation  calculus  is  a  first-order  language  in  which 
predicates  which  can  vary  in  truth  value  over  time  are  given  an  extra  argument  to  say  what 
situations  they  hold  in,  and  there  is  a  function  Result  that  maps  an  agent,  an  action,  and  a 
situation  into  the  situation  which  results  from  the  agent  performing  the  action  in  the  first 
situation.  Statements  about  the  effects  of  actions  are  then  expressed  by  formulas  like 
P(Result(A,Act,S)), which means that P is true in the situation that results from A performing
Act in situation S.

In order to integrate these ideas into our logic of knowledge, we will redefine the
situation calculus as a modal logic. We will introduce a modal operator Res for talking
about  the  results  of  actions,  parallel  to  the  modal  operator  Know  for  talking  about 
knowledge.  Situations  will  not  be  referred  to  explicitly  in  this  language,  but  they  will 
reappear  when  we  specify  the  possible-world  semantics  for  Res  and  formalize  that  semantics 
in first-order logic. We will let Res take as its arguments a description of an event and a
formula, such that Res(Ev,P) means that if the event described by Ev occurs, the formula P
will then be true. The possible-world semantics for Res will be specified in terms of an
accessibility relation R, parallel to K, such that R(:Ev,W1,W2) means that W2 is the
situation/world that would result from the event :Ev happening in W1. We need to
distinguish between expressions that represent event descriptions (e.g., Ev) and expressions
that represent events (e.g., :Ev), because the same event description may refer to different
events in different possible worlds. (Generally, if X is a symbol in the modal language, :X
will be the corresponding symbol in the possible-world language.) For example,
Dial(Combination(Sf1)) will refer to different sequences of dial twisting in worlds where the
combination of Sf1 differs.
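
The difference between an event description and the event it denotes can be sketched as follows. The world representation, the combination values, and the function name denote_event are assumptions introduced only for this illustration.

```python
def denote_event(world):
    """Evaluate the description Dial(Combination(Sf1)) in a given world.

    The description is fixed, but the event it picks out depends on what
    the combination of Sf1 is in that world.
    """
    return ("Dial", world["Combination"]["Sf1"])

# Two hypothetical worlds that disagree about the combination of Sf1.
w_a = {"Combination": {"Sf1": "12-34-56"}}
w_b = {"Combination": {"Sf1": "65-43-21"}}

event_a = denote_event(w_a)
event_b = denote_event(w_b)
```

The single description yields the distinct events event_a and event_b, which is why R is stated over events such as :Ev rather than over descriptions such as Ev.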

We will assume that if it is impossible for :Ev to occur in W1 (i.e., the preconditions of
:Ev are not satisfied), then there is no W2 such that R(:Ev,W1,W2) holds. Otherwise, we
assume that there is exactly one W2 such that R(:Ev,W1,W2) holds. Formally, this amounts to
an  assumption  that  all  events  are  deterministic,  which  might  seem  to  be  an  unnecessary 
limitation.  Pragmatically,  however,  it  doesn't  matter  whether  we  say  that  a  given  event  is 
nondeterministic,  or  that  it  is  deterministic,  but  no  one  knows  precisely  what  the  outcome 
will be. If we treated events as being nondeterministic, we could say that someone knows
exactly what situation he is in, but doesn't know what situation would result if :Ev occurs,
because :Ev is nondeterministic. It would be completely equivalent, however, to say that :Ev
is  deterministic,  and  that  this  person  doesn’t  know  exactly  what  situation  he  is  in  because  he 
doesn't  know  what  the  result  of  :Ev  would  be  in  that  situation. 

Giving  a  possible-world  semantics  for  Res  requires  specifying  which  possible  worlds  a 
formula of the form Res(Ev,P) is true in. (If we want to say that Res(Ev,P) is simply true, we
have to say that it is true in the actual world.) With the assumptions we have just made,
we can say that Res(Ev,P) is true in W1 just in case there is some W2 which is the
situation/world that results from the event described by Ev happening in W1, and in which
P is true. Because this definition involves an existential quantifier, we get a strong
interpretation  of  Res  in  that  it  must  be  possible  for  the  event  described  by  Ev  to  occur  for 
Res(Ev,P) to be true. There is a corresponding weak interpretation which is noncommittal as
to whether it is possible for Ev to occur, but asserts that if it did, P would be true in the
resulting situation. This weaker interpretation is obtained by saying that P must be true in
every situation that results from the occurrence of the event described by Ev. We will give
this weaker interpretation to the operator Resl.

With  this  definition,  we  can  develop  a  theory  based  on  Res  and  Resl  parallel  to  our 
theory of Know, and then convert the theory into first-order logic. An instance of Res(Ev,P)
whose truth is to be determined relative to W1 will be replaced by the corresponding
instance of:

∃w2(R(:Ev,W1,w2) ∧ T(w2,P)),

and an instance of Resl(Ev,P) whose truth is to be determined relative to W1 will be
replaced by the corresponding instance of:

∀w2(R(:Ev,W1,w2) ⊃ T(w2,P)).

Both  of  these  can  be  handled  in  the  same  way  as  the  possible-world  transformations  of 
formulas  containing  Know. 
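
The two translations differ only when the event cannot occur. The sketch below evaluates both over a finite relation R. The event name "open" and the situation names are hypothetical, and T(w,P) is modeled by calling a Python function P on w.

```python
def res(ev, p, w1, R):
    """Strong reading: some w2 satisfies R(ev,w1,w2) and T(w2,P)."""
    return any(p(w2) for (e, w, w2) in R if e == ev and w == w1)

def resl(ev, p, w1, R):
    """Weak reading: every w2 with R(ev,w1,w2) satisfies T(w2,P)."""
    return all(p(w2) for (e, w, w2) in R if e == ev and w == w1)

# Hypothetical model: "open" can occur in w_unlocked but not in w_locked,
# so R contains no triple whose first situation is w_locked.
R = {("open", "w_unlocked", "w_open")}

def P(w):
    return w == "w_open"

strong_unlocked = res("open", P, "w_unlocked", R)
weak_unlocked = resl("open", P, "w_unlocked", R)
strong_locked = res("open", P, "w_locked", R)
weak_locked = resl("open", P, "w_locked", R)
```

In w_unlocked the two readings agree. In w_locked the strong reading is false because there is no resulting situation, while the weak reading is vacuously true.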

The  type  of  event  we  will  normally  be  concerned  with  is  an  agent  performing  an  action. 
We  will  let  Do(A,Act)  be  a  description  of  the  event  consisting  in  the  agent  named  by  A 
performing the action named by Act. We will assume that the set of possible agents is the
same as the set of possible knowers. Do will be a rigid function, so Do(A,Act) will be the
standard name of an event if A is the standard name of an agent and Act is the standard
name  of  an  action. 

It would be more precise to say that Do(A,Act) names a type of event rather than an
individual event, since an agent can perform the same action on different occasions. We
would then say that Res and R are relations on event types. We will let the present usage
stand, however, since we will not need to distinguish individual events in this thesis.

Most  actions  can  be  thought  of  as  a  general  procedure  applied  to  some  specific  objects. 
These  general  procedures  will  be  represented  by  functions  which  map  the  objects  the 
procedure  is  applied  to  into  the  action  of  applying  the  procedure  to  those  objects.  For 
instance, if Dial represents the general procedure of dialing combinations of safes, C1
represents a combination, and Sf1 represents a safe, then Dial(C1,Sf1) represents the action of
dialing the combination C1 on the safe Sf1.

This  formalism  gives  us  the  ability  to  talk  about  someone's  knowing  about  the  effects  of 
an action. In the modal logic, we can express the assertion that A1 knows that P would
result from A2 doing Act as Know(A1,Res(Do(A2,Act),P)). The possible-world analysis of this
statement is that in every world that is compatible with what A1 knows, there is a world
which is the result of A2 doing Act and in which P is true (see figure 3.1). Formally, this is
expressed by:

∀w1 (K(:A1, W0, w1) ⊃ ∃w2 (R(:Do(:A2, :Act), w1, w2) ∧ T(w2, P))),

assuming that A1, A2, and Act are rigid designators. As with event descriptions and events,
we distinguish between terms in the modal logic such as A1, A2, and Act and terms in the
possible-world notation such as :A1, :A2, and :Act. In general, terms in the modal logic may
correspond to different terms in the possible-world notation depending on what possible
world they are evaluated in. This is discussed in more detail in section 4.3.

Figure 3.1  ∀w1 (K(:A1, W0, w1) ⊃ ∃w2 (R(:Do(:A2, :Act), w1, w2) ∧ T(w2, P)))
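The elimination of nested modal operators can be sketched as a small recursive rewriting procedure. This is an illustrative sketch only: the tuple encoding of formulas and the function name translate are assumptions, and every term is assumed to be a rigid designator, so that a modal term X simply becomes :X.

```python
import itertools

def translate(phi, world, fresh=None):
    """Rewrite a modal formula as a possible-world formula relative to `world`.

    Formulas are atoms (strings) or triples (operator, argument, body).
    """
    if fresh is None:
        fresh = itertools.count(1)           # supplies w1, w2, ...
    if isinstance(phi, str):                 # atomic formula P
        return f"T({world},{phi})"
    op, arg, body = phi
    w = f"w{next(fresh)}"                    # new world variable
    inner = translate(body, w, fresh)
    if op == "Know":
        return f"∀{w}(K(:{arg},{world},{w}) ⊃ {inner})"
    if op == "Res":
        return f"∃{w}(R(:{arg},{world},{w}) ∧ {inner})"
    if op == "Res1":
        return f"∀{w}(R(:{arg},{world},{w}) ⊃ {inner})"
    raise ValueError(f"unknown operator: {op}")

# Know(A1, Res(Do(A2,Act), P)) evaluated relative to W0.
formula = ("Know", "A1", ("Res", "Do(:A2,:Act)", "P"))
text = translate(formula, "W0")
```

Applied to the formula above, the sketch reproduces the possible-world formula given in the text, with each modal operator contributing one quantified world variable.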


McCarthy  and  Hayes  ran  into  difficulty  with  identifying  possible  worlds  with  situations 
because  they  wanted  to  express  knowledge  about  the  effects  of  actions  in  terms  of  knowing 
formulas  of  the  situation  calculus  (McCarthy,  1978).  This  requires  allowing  occurrences  of 
terms  that  denote  situations  inside  the  modal  operator  Know.  The  problem  is  how  to  relate 
these  terms  to  the  references  to  possible  worlds  that  are  introduced  when  an  occurrence  of 
Know  is  eliminated.  From  our  point  of  view,  this  problem  is  the  result  of  confusing  two 
different  levels  of  language.  In  the  modal  notation  we  do  not  have  terms  denoting 
situations.  At  this  level,  all  talk  about  the  effects  of  actions  is  in  terms  of  the  modal 
operators Res and Res1. All references to situations/worlds are introduced in transforming
the  modal  notation  into  possible-world  notation.  Thus,  we  never  have  the  problem  of 
introducing  references  to  possible  worlds  in  the  analysis  of  a  formula  that  already  refers  to 
situations. 

In addition to simple, one-step actions, we will want to talk about complex combinations
of actions. To facilitate this, we will introduce sequences, conditionals, and iterations. If P
is a formula, and Act1 and Act2 name actions, then (Act1; Act2), If(P,Act1,Act2), and
While(P,Act1) also name actions. The result of A doing (Act1; Act2) in W1 will be W2 just in
case there is some situation W3 such that W3 is the result of A doing Act1 in W1 and W2 is
the result of his doing Act2 in W3. That is, doing (Act1; Act2) is equivalent to doing Act1
and then doing Act2. The result of A doing If(P,Act1,Act2) in W1 will be W2 just in case P
is true in W1 and the result of A doing Act1 in W1 is W2, or P is false in W1 and the result
of A doing Act2 in W1 is W2. This means that doing If(P,Act1,Act2) is equivalent to doing
Act1 or Act2 depending on P. The result of A doing While(P,Act1) in W1 will be W2 just in
case the result of A doing If(P,(Act1; While(P,Act1)),Nil) in W1 is W2, where the result of A
doing Nil in W1 is W1. In other words, doing While(P,Act1) is equivalent to doing Act1
followed by While(P,Act1) if P is true, otherwise doing nothing, i.e., doing Act1 as long as P
remains true.
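These rules can be read as a small interpreter. The sketch below is an illustration under stated assumptions: situations are plain integers, primitive actions are deterministic partial functions on situations, and the tuple tags "seq", "if", and "while" as well as the primitive "dec" are hypothetical names rather than part of the formalism.

```python
def result(act, state, prims):
    """Return the situation resulting from doing `act` in `state`,
    or None if no such situation exists."""
    if act == "nil":                          # doing Nil in W1 yields W1
        return state
    if isinstance(act, str):                  # primitive action
        return prims[act](state)
    tag = act[0]
    if tag == "seq":                          # (Act1; Act2)
        _, first, second = act
        mid = result(first, state, prims)
        return None if mid is None else result(second, mid, prims)
    if tag == "if":                           # If(P, Act1, Act2)
        _, test, then_act, else_act = act
        return result(then_act if test(state) else else_act, state, prims)
    if tag == "while":                        # one level of unfolding
        _, test, body = act
        return result(("if", test, ("seq", body, act), "nil"), state, prims)
    raise ValueError(f"unknown action: {act!r}")

# "dec" decrements a counter and is impossible when the counter is 0.
prims = {"dec": lambda n: n - 1 if n > 0 else None}
countdown = ("while", lambda n: n > 0, "dec")

final = result(countdown, 3, prims)               # loop runs three times
blocked = result(("seq", "dec", "dec"), 1, prims) # second step impossible
```

A loop whose condition never becomes false would make this sketch recurse without end, which mirrors the fact that no resulting situation exists for a nonterminating action.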

The  choice  of  programming  language  constructs  for  sequences,  conditionals,  and 
iterations  is  more  than  coincidental.  If  the  references  to  agents  are  eliminated  and  possible 
situations/worlds  are  identified  with  machine  states,  then  these  rules  amount  to  a  partial 
specification  of  the  semantics  of  an  imperative  (Algol-like)  programming  language.  In  fact, 
this  approach  to  formalizing  the  semantics  of  complex  actions  is  based  on  some  ideas  of  V. 
R. Pratt and this author for formalizing the semantics of programs in modal logic. Pratt
and  his  associates  have  used  this  approach  to  develop  a  powerful  formalism  for  talking 
about  the  semantics  of  programs  which  they  call  dynamic  logic  (Pratt,  1976),  (Harel,  Meyer, 
and  Pratt,  1977). 

To continue the analogy between actions and programs, we can think of the distinction
between our strong and weak operators, Res and Res1, in terms of the program-verification
notions of total and partial correctness. Since Res(Ev,P) is defined to be true just in case P is
true in at least one possible outcome of Ev and we are assuming determinism, Res expresses
both termination and correctness. On the other hand, since Res1(Ev,P) is true when P is
true in every possible outcome of Ev, Res1 should be satisfied when, due to nontermination,
there  are  no  possible  outcomes.  This  corresponds  to  the  notion  of  partial  correctness  in  the 
semantics  of  programs. 

The  rules  that  we  have  given  so  far  are  not  sufficient  to  prove  that  there  is  no  situation 
which  is  the  result  of  a  nonterminating  iterative  action,  but  we  can  remedy  this  by  adding  a 
possible-world  version  of  Hoare’s  (1969)  rule  for  partial  correctness  of  While  loops.  In  our 
terms, the rule is that if Q is true in every situation that might result from A doing Act1 in a
situation where P and Q are true, then Q is true and P is false in every situation that might
result from A doing While(P,Act1) in a situation where Q is true. Intuitively, While(P,Act1)
will not terminate if it is executed in a situation where some condition Q holds that always
implies P and is never changed by doing Act1. What we want to show is that in such a
situation, there is no situation that is the result of carrying out While(P,Act1). This follows
immediately from the rule we have just stated, because if Q is invariant with respect to Act1
then Q will be true and P will be false in every situation that might result from doing
While(P,Act1), but since Q always implies P, P would have to be both true and false in any such
situation. Therefore, no such situation can exist.
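One way the rule just stated might be written in the possible-world notation is sketched below. The abbreviations are assumptions introduced only for this sketch: e_1 stands for the event of A doing Act1, e_W for the event of A doing While(P,Act1), and the falsity of P in w is written as the negation of T(w,P).

```latex
% Possible-world form of the partial-correctness rule for While (a sketch).
\forall w_1 \forall w_2\,
  \bigl[\bigl(T(w_1,P) \land T(w_1,Q) \land R(e_1,w_1,w_2)\bigr) \supset T(w_2,Q)\bigr]
\;\supset\;
\forall w_1 \forall w_2\,
  \bigl[\bigl(T(w_1,Q) \land R(e_W,w_1,w_2)\bigr)
        \supset \bigl(T(w_2,Q) \land \lnot T(w_2,P)\bigr)\bigr]

% Nontermination: if, in addition, Q always implies P, no result exists.
\forall w\,\bigl(T(w,Q) \supset T(w,P)\bigr)
\;\supset\;
\forall w_1\,\bigl[T(w_1,Q) \supset \lnot\exists w_2\, R(e_W,w_1,w_2)\bigr]
```

The second formula holds only given the antecedent of the first: any w_2 resulting from the loop would satisfy Q, hence P, while also falsifying P.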

Our  possible-world  semantics  for  actions  leads  us  to  say  that  a  primitive  action 
description  Act  denotes  some  primitive  action  in  each  possible  world.  (If  Act  is  a  rigid 
designator  it  will  name  the  same  action  in  every  possible  world.)  This  leads  us  to  ask  what 
the denotation of complex action descriptions such as (Act1; Act2), If(P,Act1,Act2), and
While(P,Act1) might be. The most natural answer consistent with the treatment of primitive
actions is that the denotation of a complex action description is the sequence of primitive
actions that would be performed in carrying out the complex action. Such a sequence seems
to be a natural interpretation of the term process as it is used in computer science. If we
regard procedures as action descriptions, then we conclude that the relation between
procedures and processes is that in a given environment, a procedure denotes the process
that would result from executing the procedure in that environment. Pratt (1979) has
independently proposed a similar approach, but with a slightly different notion of what a
process  is.  This  type  of  approach  seems  to  be  definitely  preferable  to  the  notion  that  a 
procedure  denotes  the  function  it  computes,  since  that  idea  ignores  questions  of  efficiency 
and  does  not  seem  to  handle  programs  such  as  operating  systems  which  do  not  in  any 
interesting  sense  compute  a  value. 

3.2  The Dependence of Action on Knowledge

As  we  pointed  out  in  section  1.1,  knowledge  and  action  interact  in  two  principal  ways. 
First,  knowledge  is  often  required  prior  to  taking  action,  and  second,  actions  can  change 
what  is  known.  In  the  first  area,  we  need  to  consider  knowledge  preconditions  as  well  as 
physical  preconditions  for  actions.  Our  main  thesis  is  that  almost  all  knowledge 
preconditions  for  actions  can  be  analyzed  as  a  matter  of  knowing  what  action  to  take. 
Recall from chapter 1 our example of trying to open a locked safe. Why is it that for an
agent  to  achieve  this  goal  by  using  the  plan  "Dial  the  combination  of  the  safe,"  he  must 
know  the  combination?  The  reason  is  that  an  agent  could  know  that  dialing  the 
combination  of  the  safe  would  result  in  the  safe  being  open,  but  still  not  know  what  to  do, 
because  he  does  not  know  what  the  combination  of  the  safe  is.  A  similar  analysis  applies  to 
knowing  a  telephone  number  in  order  to  call  someone  on  the  telephone  or  knowing  a 
password  in  order  to  gain  access  to  a  computer  system. 

It is important to realize that even mundane actions that are not usually thought of as
requiring any special knowledge are no different from the examples just cited. For instance,
none of the AI problem solving systems that dealt with the blocks world tried to take into
account whether the robot had sufficient knowledge to be able to move block A to point B.
Yet if a command were phrased as "Move my favorite block back to its original position,"
the system could be just as much in the dark as with "Dial the combination of the safe." If
the system does not know what actions satisfy the description, it will not be able to carry out
the command. The only reason that the question of knowledge seems more salient in
the  case  of  dialing  combinations  and  telephone  numbers  is  that,  in  the  contexts  where  these 
goals  naturally  arise,  usually  there  is  no  presumption  that  the  agent  knows  what  action  fits 
the  description. 

An  important  consequence  of  this  view  is  that  the  specification  of  an  action  will  not 
need  to  include  anything  about  knowledge  preconditions.  These  will  always  be  supplied  by 
our  general  theory  of  using  actions  to  achieve  goals.  What  we  will  need  to  specify  are 
criteria  for  knowing  what  action  is  referred  to  by  an  action  description.  As  we  will  see, 
though,  this  can  often  be  done  implicitly. 

In  terms  of  our  possible-world  semantics  for  knowing,  the  usual  way  of  knowing  what 
entity  is  referred  to  by  a  description  B  is  by  having  some  description  C  that  is  a  rigid 
designator, and knowing that B = C. (Note that if B itself is a rigid designator, it can be
used  for  C.)  In  particular,  then,  knowing  what  action  is  referred  to  by  an  action  description 
means  having  a  rigid  designator  for  that  action.  But  if  this  is  all  the  knowledge  that  is 
required  for  carrying  out  the  action,  then  a  rigid  designator  for  an  action  must  be  an 
executable  description  of  the  action  in  the  same  sense  that  a  computer  program  is  an 
executable  description  of  a  computation  for  the  interpreter  of  the  language  in  which  the 
program is written. Note that by "executable description", we mean that the description can
be executed provided the physical preconditions of the action are satisfied. This is true of
programs as well. If the preconditions of a program are not satisfied, it may not be possible
to execute the program because of run-time errors.

Often  the  actions  we  want  to  talk  about  are  mundane  general  procedures  that  we  would 
be  willing  to  assume  that  everyone  knows  how  to  perform.  Dialing  a  telephone  number  or 
the  combination  of  a  safe  are  likely  examples.  In  many  of  these  cases,  assuming  an  agent 
knows  the  general  procedure,  if  he  knows  what  objects  the  procedure  is  to  be  applied  to 
then  he  knows  everything  that  is  relevant  to  the  problem.  In  such  cases  the  function  which 
represents  the  general  procedure  will  be  a  rigid  function  so  that  if  the  arguments  of  the 
function  are  rigid  designators,  the  term  consisting  of  the  function  applied  to  the  arguments 
will  be  a  rigid  designator.  Hence  knowing  what  objects  the  arguments  are  amounts  to 
knowing  what  action  the  term  refers  to.  We  will  treat  dialing  the  combination  of  a  safe  as 
this type of procedure. That is, we assume that anyone who knows what combination he is
to  dial  and  what  safe  he  is  to  dial  it  on  knows  what  action  he  is  to  perform. 

There  are  other  procedures  which  we  might  also  wish  to  assume  that  anyone  could 
perform,  but  which  cannot  be  represented  as  rigid  functions.  Consider  the  blocks  world 
action  Puton(B,C).  Even  though  we  would  not  want  to  question  anyone’s  ability  to  perform 
Puton  in  general,  knowing  what  objects  B  and  C  are  will  not  be  sufficient  to  perform 
Puton(B,C) without knowing where they are. We could have a special axiom stating that
knowing what action Puton(B,C) is requires knowing where B and C are, but this will be
unnecessary if we simply assume that everyone knows the definition of Puton in terms of
more primitive actions. If we define Puton(x1,x2) as something like

(Movehand(Location(x1)); Grasp; Movehand(Location(Top(x2))); Ungrasp)

then we can treat Movehand, Grasp, and Ungrasp as rigid functions, and we can see that
executing Puton requires knowing the location of the two objects because the locations are
mentioned in the definition. So, although Puton itself is not a rigid function, we can avoid
having  a  special  axiom  saying  what  the  knowledge  preconditions  of  Puton  are,  by  defining 
Puton  as  a  sequence  of  actions  which  are  represented  by  rigid  functions.  Of  course,  in  a 
practical  system  we  would  probably  want  to  "compile"  this  information,  rather  than  going 
back  to  the  definition  each  time  we  reason  about  Puton.  The  compiled  information  would 
be  in  the  form  of  theorems  that  can  be  derived  from  the  basic  axioms  of  the  system  and  the 
definition  of  Puton. 
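The role of rigid designators here can be illustrated on a finite set of worlds. The sketch below is an assumption-laden illustration: a term is modeled as a function from worlds to denotations, "knowing what the term refers to" is modeled as the term having the same denotation in every world compatible with the agent's knowledge, and the world contents and the helper name knows_what are hypothetical.

```python
def knows_what(term, compatible_worlds):
    """True if `term` denotes the same thing in every world
    compatible with what the agent knows."""
    return len({term(w) for w in compatible_worlds}) == 1

# The agent is unsure where block B is: two worlds remain compatible.
worlds = [{"loc_B": "table"}, {"loc_B": "shelf"}]

# Movehand applied to a rigid argument: same action in every world.
movehand_table = lambda w: ("Movehand", "table")
# Movehand(Location(B)): the argument varies from world to world.
movehand_loc_b = lambda w: ("Movehand", w["loc_B"])

rigid_known = knows_what(movehand_table, worlds)
nonrigid_known = knows_what(movehand_loc_b, worlds)
# Once only one location is compatible, the action is determined.
after_looking = knows_what(movehand_loc_b, worlds[:1])
```

Because Movehand is treated as rigid, the only source of uncertainty about the action is the term Location(B), which is why knowing the location suffices.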

In  the  preceding  discussion,  we  have  been  assuming  that  knowing  what  to  do  amounts 
to knowing what single specific action to perform; e.g., we have assumed that Dial(C1,Sf1)
names only one action. Obviously, there are many different sequences of movements that
would constitute dialing a particular combination on a particular safe, so we really ought to
regard Dial(C1,Sf1) as naming a class of actions rather than a single action. It would be
completely straightforward to modify our theory to take this into account, but none of the
interesting problems we want to look at turn on this point. Therefore we will merely note
the fact and let it pass. It should be realized, though, that this is different from the
distinction between individual events and types of events that we made in the previous
section. Even if in a strict sense there were only one way to dial the combination of a safe,
so Dial(C1,Sf1) referred to a definite action, performing this action on different occasions
would  still  constitute  different  individual  events. 

The  picture  we  have  presented  seems  to  be  an  adequate  account  of  knowing  how  to 
perform  the  sort  of  action  that  one  can  be  told  how  to  perform,  but  many  skills  do  not 
appear  to  fit  well  in  this  theory.  For  instance,  knowing  how  to  ride  a  bicycle,  play  the 
piano,  or  speak  a  language  does  not  seem  to  consist  entirely  in  being  in  possession  of  the 
right  factual  knowledge.  For  instance,  someone  could  be  an  expert  on  piano  playing, 
knowing all about such things as the notation of piano music and the theory of technique
(e.g., what fingerings to use for various chords), but not be able to play because he had
never practiced. The difference between such a person and a concert pianist, though, would
not be a matter of knowledge (in the sense we have been using) at all. It seems probable
that any analysis of the difference would have to be at the level of physiology. Therefore,
in our theory, whenever we consider an action that can only be performed by someone
possessing a specific skill, we will not treat the skill as a matter of knowledge, but rather as
one of the "physical" preconditions for performing the action. For example, in the
specification of the action Read we will require that the agent of the action satisfy the
condition Reads (i.e., "is able to read").

To  formalize  the  theory  we  have  developed,  we  will  introduce  a  new  modal  operator 
Can. Can(A,Act,P) will mean that A can achieve P by performing Act, in the sense that A
knows how to achieve P by performing Act. This notion could be used in a planning
system to achieve a goal P by finding some plan Act such that the system can deduce
Can(A,Act,P), where A is the system's name for itself. We will not give a possible-world
semantics for Can directly; instead, we will give a definition of Can in terms of Know and Res,
which we can use in reasoning about Can to transform a problem into terms of possible
worlds. For a simple action that cannot be decomposed into more primitive actions, the
definition of Can(A,Act,P) will be that A knows what action Act describes, and he knows that
his performing Act will result in P being true. The "will result" in the second condition
must be interpreted in the strong sense that it is possible for A to perform Act (i.e., Res).
This  forces  the  planner  to  check  that  the  preconditions  of  his  plan  are  fulfilled. 

This  definition  of  Can  is  adequate  for  simple  actions,  but  it  is  too  stringent  for  complex 
plans.  The  reason  is  that  it  requires  the  agent  to  know  ahead  of  time  exactly  what  he  is 
going  to  do.  In  a  complex  plan,  however,  he  may  take  some  action  that  results  in  his 
acquiring knowledge about what to do in later stages of the plan. All that is required when
he starts executing the plan is that he knows what to do first and he knows that at each
subsequent step he will know what to do next. So, the definition of Can for sequences of
actions is that A can achieve P by doing (Act1; Act2) just in case by doing Act1 he can bring
it about that by doing Act2 he can achieve P. If Act1 is a simple action, then the two rules
we have given require that for Can(A,(Act1; Act2),P) to be true, A must know what action
Act1 describes, and he must know that after performing Act1 he can achieve P by
performing Act2.

Finally,  we  will  define  Can  for  conditional  and  iterative  plans.  The  rule  for  conditionals 
is that Can(A,If(P,Act1,Act2),Q) is true just in case A knows that P is true and A can achieve
Q by doing Act1, or A knows that P is false and A can achieve Q by doing Act2. In other
words, for an agent to know how to achieve a goal using a conditional plan, he must know
whether the condition is true and know how to achieve the goal using the appropriate
branch of the conditional. The rule for iterative plans is quite simple; it just specifies one
level of expansion of the loop. The rule is that Can(A,While(P,Act1),Q) is true just in case
Can(A,If(P,(Act1; While(P,Act1)),Nil),Q) is true. Since Nil is a primitive action, it is covered by
the first rule, but we will note that the result of applying this rule to Nil is that Can(A,Nil,P)
is  true  just  in  case  A  knows  that  P  is  true.  This  may  seem  to  be  a  trivial  point,  but  it  is 
important  for  a  planner  to  realize  that  if  its  goal  is  already  true  then  it  doesn’t  have  to  do 
anything. 
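The rules for Can can be exercised on a small hand-built model of the safe example. Everything concrete in the sketch below is an assumption made for illustration: the world names, the action names "read" and "dial_combo", the tables K, DENOTES, and R, and the choice to model a single agent. The While rule is omitted, since it is just the unfolding into If already described.

```python
# K[w]: worlds compatible with what the agent knows in w.
K = {"a1": {"a1", "b1"}, "b1": {"a1", "b1"}, "a2": {"a2"}, "b2": {"b2"}}
# Denotation of each primitive action description in each world.
DENOTES = {"read": lambda w: "read",
           "dial_combo": lambda w: "dial17" if w[0] == "a" else "dial42"}
# R[(action, w)]: the situation resulting from doing `action` in w.
R = {("read", "a1"): "a2", ("read", "b1"): "b2",
     ("dial17", "a1"): "a3", ("dial42", "b1"): "b3",
     ("dial17", "a2"): "a3", ("dial42", "b2"): "b3"}

def knows(w, prop):
    """The agent knows `prop` in w if it holds in every compatible world."""
    return all(prop(v) for v in K[w])

def can(w, act, goal):
    if act == "nil":                          # Can(A,Nil,P) iff A knows P
        return knows(w, goal)
    if isinstance(act, str):                  # simple action
        # A knows what action the description refers to ...
        one_action = len({DENOTES[act](v) for v in K[w]}) == 1
        # ... and knows that doing it results (strong sense) in the goal.
        def achieves(v):
            out = R.get((DENOTES[act](v), v))
            return out is not None and goal(out)
        return one_action and knows(w, achieves)
    if act[0] == "seq":                       # Can(A,Act1,Can(A,Act2,P))
        _, first, second = act
        return can(w, first, lambda v: can(v, second, goal))
    if act[0] == "if":
        _, test, then_act, else_act = act
        return ((knows(w, test) and can(w, then_act, goal)) or
                (knows(w, lambda v: not test(v)) and can(w, else_act, goal)))
    raise ValueError(f"unknown action: {act!r}")

is_open = lambda w: w in {"a3", "b3"}
direct = can("a1", "dial_combo", is_open)
read_first = can("a1", ("seq", "read", "dial_combo"), is_open)
```

In this model the agent cannot open the safe directly, because the description "dial_combo" picks out different actions in the worlds he considers possible, but he can do so by first reading the combination.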

3.3  The Effects of Action on Knowledge

In  reasoning  about  the  effects  of  an  action  on  the  knowledge  of  the  agent,  our  chief 
concern  will  be  whether  the  action  gives  the  agent  any  new  information.  Those  actions  that 
provide the agent with new information will be called knowledge-producing actions. We will
say that an action is knowledge-producing just in case after performing the action the agent
would know more about the resulting situation than he did before performing the action.
In the blocks world, looking inside a box could be a knowledge-producing action, while
moving a block probably would not. Even if after moving the block the agent could see
what configuration the blocks are in, the action would not be considered a
knowledge-producing action, provided the agent could have predicted beforehand what
configuration would result. In the real world there are probably no actions which are never
knowledge-producing, because all physical processes are subject to errors. Nevertheless, it seems clear
that  we  do  and  should  treat  many  actions  as  not  being  knowledge-producing  to  simplify  the 
process  of  planning. 

Even if an action is not knowledge-producing in the sense that we have just defined,
performing  the  action  will  still  alter  the  state  of  knowledge  of  the  agent.  The  reason  for 
this  is  that,  assuming  the  agent  is  aware  of  his  action,  he  will  then  know  that  the  action  has 
been  performed.  As  a  result,  the  tense  and  modality  of  many  of  the  things  he  knows  will 
change.  For  example,  if  before  performing  the  action  he  knows  that  P  is  true,  then  after 
performing  the  action  he  will  know  that  P  was  true  before  he  performed  the  action. 
Similarly,  if  before  performing  the  action  he  knew  that  P  would  be  true  after  performing 
the  action,  then  after  performing  the  action  he  will  know  that  P  is  true. 

We  can  represent  this  very  elegantly  in  terms  of  possible  worlds.  Suppose  Act  describes 
an action which is not knowledge-producing and A names an agent. Then let :A be the
agent described by A, and let :Ev1 be the event described by Do(A,Act), i.e., the event which
consists in :A performing the action described by Act. Then for any possible worlds W1 and
W2 such that W2 is the result of :Ev1 happening in W1, the worlds which are compatible
with what :A knows in W2 are exactly those worlds which are the result of :Ev1 happening
in some world which is compatible with what :A knows in W1. This tells us exactly how
what :A knows after :Ev1 happens (i.e., after :A performs the action described by Act) is
related to what :A knows before :Ev1 happens.

We can try to get some insight into this analysis by studying figure 3.2. Sequences of
possible situations connected by events can be thought of as possible courses of events. If
W1 is an actual situation and :Ev1 happens producing W2, then W1 and W2 form a
subsequence of the actual course of events. Now we can ask what other courses of events
are compatible with what :A knows in W1 and in W2. Suppose W4 and W3 are connected
by :Ev1 in a course of events that is compatible with what :A knows in W1. Since :Ev1 is
not knowledge-producing for :A, the only sense in which his knowledge is increased by :Ev1
is that he knows that :Ev1 has happened. Since :Ev1 happens at the corresponding place in
the course of events that includes W4 and W3, this course of events will still be compatible
with everything :A knows in W2. However, the appropriate "tense shift" takes place. In
W1, W4 is a possible alternative present for :A, and W3 is a possible alternative future. In
W2, W3 is a possible alternative present for :A, and W4 is a possible alternative past.

Next consider a different course of events that includes W5 and W6 connected by a
different event :Ev2. This course of events might be compatible with what :A knows in W1
if he is not certain what he will do next, but after :Ev1 has happened and he knows that it
has happened, this course of events is no longer compatible with what he knows. Thus, W6
is not compatible with what :A knows in W2. We can see then that even actions which
provide the agent no new information from the outside world still filter out for him those
courses of events where he might perform actions other than those which he actually
performs.
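The filtering effect of an action that is not knowledge-producing can be sketched as a simple computation on sets of worlds. The sketch assumes deterministic events encoded as a dictionary; the function name update and the encoding are illustrative, and the world and event names follow the discussion above.

```python
def update(compatible_before, event, R):
    """Worlds compatible with what the agent knows after `event`, for an
    event that is not knowledge-producing: exactly the results of `event`
    in the worlds that were compatible beforehand."""
    return {R[(event, w)] for w in compatible_before if (event, w) in R}

# R[(event, w)] is the situation that results from `event` happening in w.
R = {("Ev1", "W1"): "W2", ("Ev1", "W4"): "W3", ("Ev2", "W5"): "W6"}

before = {"W1", "W4", "W5"}        # compatible with what :A knows in W1
after = update(before, "Ev1", R)   # compatible with what :A knows in W2
```

W3 survives as an alternative present, while W6 is dropped because it belongs to a course of events in which a different event happens.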

Figure 3.2  The effect of performing an action that is not knowledge-producing
on the knowledge of the agent.

The idea of a filter on possible courses of events also provides a good picture of
knowledge-producing actions. With these actions, though, the filter is even stronger, since
they not only filter out courses of events that differ from the actual course of events as to
what happens, but they also filter out courses of events which are incompatible with the
information the action produces. Suppose Act describes a knowledge-producing action,
where the knowledge that the agent gains is whether P is true. If :Ev is the event which
consists in :A performing the action described by Act, then for any possible worlds W1 and
W2 such that W2 is the result of :Ev happening in W1, the worlds which are compatible with
what :A knows in W2 are exactly those worlds which are the result of :Ev happening in
some world which is compatible with what :A knows in W1 and in which P has the same
truth value as in W2. It is this final condition that distinguishes actions that are
knowledge-producing from those that are not.

Figure 3.3  The effect of performing a knowledge-producing action
on the knowledge of the agent.

Figure 3.3 illustrates this analysis. Suppose W1 and W2 are connected by :Ev and are
part of the actual course of events. Suppose further that P is true in W2. Let W4 and W3
also be connected by :Ev, and let them be part of a course of events that is compatible with
what :A knows in W1. If P is true in W3, then if the only thing :A learns about the world
from :Ev (other than that it has happened) is whether P is true, this course of events will
still be compatible with what :A knows after :Ev happens. That is, W3 will be compatible
with what :A knows in W2. Suppose, on the other hand, that W5 and W6 form part of a
similar course of events, except that P is false in W6. If :A does not know in W1 whether P
would be true after :Ev happened, then this course of events will be compatible with what
he knows in W1. After :Ev has happened, however, he will know that P is true, so this
course of events will no longer be compatible with what he knows. That is, W6 will not be
compatible with what :A knows in W2.
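For a knowledge-producing action the same computation acquires one extra condition. The sketch below assumes deterministic events encoded as a dictionary and a proposition P given as a predicate on worlds; the function name update_sensing and the particular truth assignment are illustrative assumptions.

```python
def update_sensing(compatible_before, event, R, p, actual_after):
    """Worlds compatible with what the agent knows after a knowledge-producing
    event that tells him whether P: results of `event` in previously
    compatible worlds, restricted to those agreeing with the actual
    outcome on the truth value of P."""
    results = {R[(event, w)] for w in compatible_before if (event, w) in R}
    return {w2 for w2 in results if p(w2) == p(actual_after)}

# R[(event, w)] is the situation that results from `event` happening in w.
R = {("Ev", "W1"): "W2", ("Ev", "W4"): "W3", ("Ev", "W5"): "W6"}
p = lambda w: w in {"W2", "W3"}    # P is true in W2 and W3, false in W6

before = {"W1", "W4", "W5"}        # compatible with what :A knows in W1
after = update_sensing(before, "Ev", R, p, "W2")
```

Here W6 is eliminated even though the same event leads to it, because it disagrees with the actual outcome W2 about P.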


One  major  advantage  of  this  approach  to  describing  how  an  action  affects  what  the 
agent  knows  is  that,  not  only  have  we  specified  what  he  learns  from  the  action,  but  also 
what  he  does  not  learn.  Our  analysis  gives  us  not  only  sufficient  conditions  for  inferring 
that :A knows that P after event :Ev, but also necessary conditions. In the case of an action
which  is  not  knowledge-producing,  we  can  infer  that  unless  :A  knew  before  performing  the 
action  whether  P  would  be  true,  he  does  not  know  afterwards  either.  In  the  case  of  a 
knowledge-producing  action  where  what  is  learned  is  whether  Q  is  true,  he  will  not  know 
whether  P  is  true  unless  he  already  knows,  or  he  knows  how  P  depends  on  Q. 

This  possible-world  analysis  of  the  effects  of  action  on  knowledge  gives  us  everything 
we need to formalize the notion of a test that we presented in section 1.1. Recall that a test
was  defined  to  be  an  action  that  has  a  directly  observable  result  that  depends  conditionally 
on  an  unobservable  precondition.  In  the  terminology  of  this  chapter  we  would  say  that  a 
test  is  a  knowledge-producing  action  where  the  observable  result  is  part  of  the  information 
provided  by  the  action.  In  chapter  1  we  identified  three  conditions  on  an  action  being 
usable  as  a  test  for  P: 

(1)  The  agent  knows  that  Q  will  be  true  after  he  performs  the  action  just  in  case  P  is 
true  before  he  performs  the  action. 

(2)  After  the  agent  performs  the  action,  he  knows  that  he  has  just  performed  the 
action. 

(3)  After  the  agent  performs  the  action,  he  knows  whether  Q  is  true. 

Conditions  (2)  and  (3)  will  be  satisfied  if  Act  describes  a  knowledge-producing  action, 
where  the  knowledge  provided  includes  whether  Q  is  true.  So,  any  such  action  can  be  used 
as  a  test  for  P  just  in  case  (1)  is  also  satisfied.  Using  the  theory  that  we  have  presented,  we 
can  show  that  this  is  the  case,  as  illustrated  by  figure  3.4. 

Figure 3.4  The effect of a test on the knowledge of the agent.

Suppose the action described by Act satisfies (1) - (3), and let :Ev be the event which
consists in :A performing the action. Suppose that P is true in W1, but :A does not know
whether P is true. Under these conditions, there will be at least one situation which is
compatible with what he knows in W1 in which P is true (W4) and at least one such
situation in which P is false (W5). Since :A knows how the truth of Q after :Ev depends on
the truth of P before :Ev, if W3 is the result of :Ev happening in W4, then Q must be true in
W3. Similarly, if W6 is the result of :Ev happening in W5, then Q must be false in W6. Now,
since P is in fact true in W1, Q must be true in W2, the situation that results from :Ev
happening in W1. By condition (3), :A knows this fact, so W3 will be compatible with what
:A knows in W2, but W6 will not. This argument shows that after :Ev actually happens no
possible course of events in which P is false before :Ev happens will be compatible with
what :A knows. Thus we conclude that after :Ev happens, :A will know that P was true. It
should be easy to see that if we assume that :A knows whether :Ev changes the truth value
of P we could also show that :A knows whether P is true after :Ev occurs. An exactly
parallel argument would apply if P were false in W1, so we can see that our theory
completely captures the reasoning about tests that we described in chapter 1, based on the
general principles that govern reasoning about knowledge and action.
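The argument about tests can be checked on a toy model. The following is only an illustrative sketch and not part of the formalism: situations are represented as Python dictionaries of truth values, the names run_test, compatible_after, knows, and knows_whether are hypothetical, and the test action is assumed to leave P unchanged and to make Q true exactly when P was true beforehand.

```python
from typing import Callable, Dict, List

Situation = Dict[str, bool]  # truth values of the atomic facts P and Q


def run_test(before: Situation) -> Situation:
    """Assumed test action: Q becomes true just in case P was true; P is unchanged."""
    return {"P": before["P"], "Q": before["P"]}


def compatible_after(actual: Situation,
                     compatible_before: List[Situation],
                     event: Callable[[Situation], Situation],
                     observed: str) -> List[Situation]:
    """Situations compatible with what the agent knows after a knowledge-producing event.

    A result situation stays compatible only if it agrees with the actual
    result on the observed fact (conditions (2) and (3) in the text).
    """
    actual_result = event(actual)
    results = [event(w) for w in compatible_before]
    return [r for r in results if r[observed] == actual_result[observed]]


def knows(compatible: List[Situation], fact: str) -> bool:
    """The agent knows `fact` if it holds in every compatible situation."""
    return all(w[fact] for w in compatible)


def knows_whether(compatible: List[Situation], fact: str) -> bool:
    """The agent knows whether `fact` holds if all compatible situations agree on it."""
    return len({w[fact] for w in compatible}) <= 1


# W1 is the actual situation; W4 and W5 are compatible with what the agent knows in W1.
w1 = {"P": True, "Q": False}
w4 = {"P": True, "Q": False}
w5 = {"P": False, "Q": False}
before = [w1, w4, w5]
after = compatible_after(w1, before, run_test, observed="Q")
```

In this sketch the agent does not know whether P holds before the test (knows_whether(before, "P") is false), while every situation in after makes P true, which mirrors the conclusion that W6 drops out of the compatible situations and W3 remains.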


4.  Formalizing  the  Possible-World  Semantics  for  Knowledge 

4.1  Object  Language  and  Meta-Language 

We  have  now  presented  all  the  basic  theory  that  we  need  to  construct  a  formalism  for 
reasoning  about  knowledge  and  action.  The  essence  of  our  approach  is  to  define  a  logical 
language  that  contains  modal  operators  for  stating  facts  about  knowledge  and  action,  specify 
a  possible-world  semantics  for  the  modal  operators,  and  formalize  that  semantics  in  an 
ordinary  first-order  theory  to  which  standard  automatic  deduction  techniques  can  be 
applied. A major question that we have not answered, though, is what formal role, if any,
the  modal  language  will  play  in  this  formalism.  In  most,  and  perhaps  all,  cases  it  would  be 
possible  to  frame  the  problem  entirely  within  the  concepts  of  the  possible-world  semantics, 
by-passing  the  modal  operators  completely.  If  we  did  this,  the  modal  language  would  simply 
be  a  heuristic  device  for  us  to  use  in  formulating  problems  in  the  possible-world  formalism. 

Rather  than  follow  this  approach,  we  will  incorporate  the  modal  language  directly  into 
our  formalism.  We  will  do  this  by  encoding  expressions  of  the  modal  language  (which  we 
will  henceforth  call  the  object  language)  as  terms  in  a  first-order  language  that  talks  about 
possible  worlds  (which  we  will  call  the  meta-language).  Then  we  can  axiomatize  the 
interpretation  of  modal  expressions  in  terms  of  possible  worlds  using  the  relation  T  that  we 
introduced  in  chapter  2,  where  T(W,P)  is  a  meta-language  formula  which  means  that  the 
object-language  formula  P  is  true  in  the  world  W.  This  is  an  idea  adopted  from  McCarthy 
(1975). 

There  are  several  reasons  for  doing  things  in  this  way.  First  of  all,  even  where  the 
translation  from  modal  notation  to  possible-world  notation  is  quite  direct,  the  modal 
notation is much more concise. Recall from section 3.1 the example of saying that A1 knows
that A2 performing Act will result in P being true. In the modal notation this is expressed
simply as Know(A1,Res(Do(A2,Act),P)). In the possible-world notation, however, it becomes:

∀w1(K(:A1,W0,w1) ⊃ ∃w2(R(:Do(:A2,:Act),w1,w2) ∧ T(w2,P))).


As  a  language  for  problem  specification,  then,  the  modal  notation  is  clearly  preferable  to  the 
possible-world  notation. 

There  is  a  deeper  problem  than  this,  however,  which  seems  not  to  have  been  previously 
noted.  The  possible-world  framework  is,  in  a  sense,  conceptually  impoverished  compared  to 
the  modal  framework.  Even  if  we  can  represent  the  same  states  of  affairs  within  either 
framework,  it  does  not  follow  that  for  every  concept  we  can  express  in  the  modal  language 
there  will  be  a  corresponding  concept  in  the  possible-world  language.  This  seems  to  be  the 
case  with  the  modal  operator  Can.  Notice  that  while  there  is  a  simple  correspondence 
between  the  modal  operator  Know  and  the  accessibility  relation  K,  and  between  the  modal 
operators  Res  and  Real  and  the  accessibility  relation  R,  we  have  no  accessibility  relation  that 
corresponds  directly  to  Can.  Nevertheless,  any  formula  of  the  form  Can(A,Act,P)  has  a 
corresponding  formula  in  the  possible-world  language.  That  formula  can  be  obtained  by 
expanding the definition of Can in terms of Know and Res (see section 5.2) and then
transforming  the  occurrences  of  these  operators  into  their  possible-world  counterparts.  The 
problem  is  that  the  axioms  that  describe  this  transformation  appear  not  to  be  formulatable 
completely  within  the  possible-worlds  framework. 

It  is  important  to  remember  that  Can  is  not  simply  a  relation  among  an  agent,  an  action, 
and  a  goal;  it  is  a  relation  among  an  agent,  an  action  described  in  a  particular  way,  and  a 
goal described in a particular way. Furthermore, if the action is described by a complex
sequence  of  subactions,  then  the  requirement  that  the  agent  know  what  subactions  are  being 
described  is  distributed  over  the  execution  of  the  whole  sequence.  The  agent  does  not  have 
to  know  exactly  what  action  a  particular  step  of  the  sequence  describes  until  he  actually  has 
to do it. This seems to require that Can be defined recursively over action descriptions. In
particular, the definition of Can has to allow for the fact that Can(A,Act1,P) can be true and
Can(A,Act2,P) can be false even if Act1 and Act2 both describe the same action. We can
accommodate this in the modal framework, because it allows such intensional constructions.
The possible-world framework, however, is extensional, so there is a problem.

This  sort  of  difficulty  does  not  arise  with  Know  because  there  is  no  need  for  a  recursive 
definition to capture the meaning of Know. Res(Do(A,Act),P) and Real(Do(A,Act),P), like Can,
may be defined recursively over Act (indeed, we will find it convenient to do so), but the
recursion can be pushed into the possible-world framework, because Act occupies an
extensional position in these formulas. That is, Res(Do(A,Act1),P) and Res(Do(A,Act2),P)
must have the same truth value if Act1 and Act2 describe the same action. Therefore we
can give the definitions of Res and Real as simple formulas in terms of whatever event
is denoted by their first argument. The recursion is then introduced in a natural way in
determining what action is denoted by a complex action description. The problem with Can
is that it not only requires a recursive definition, but the argument that definition recurs on
is interpreted intensionally. There may be some way to make such a definition fit naturally
into a pure possible-world framework, but I have not been able to find it.

The result of these considerations is that although any particular formula involving Can
will  have  a  possible-world  equivalent,  there  is  no  single  concept  in  the  possible-world 
framework  which  corresponds  directly  to  Can  in  the  way  that  the  accessibility  relations  K 
and  R  correspond  to  the  other  modal  operators.  We  can  draw  an  analogy  here  with  various 
levels  of  programming  languages.  We  know  that  in  theory  any  program  can  be  translated 
into  any  universal  basis  for  computation  no  matter  how  primitive,  for  example  Turing 
machines or combinatory logic. The constructs that are used in one system may have no
analogues  in  another  system,  however.  Program  variables  are  a  good  example  of  such  a 
construct.  Almost  every  practical  programming  language  has  some  notion  of  variable  in  it, 
yet  there  are  bases  for  computation,  like  combinatory  logic  (Curry  and  Feys,  1958),  in  which 
there are no variables. So even though any program in a language with variables can be
translated  into  combinatory  logic,  the  variables  will  disappear  in  the  process,  the  same  way 
the  concept  Can  disappears  in  translating  from  the  modal  notation  to  the  possible-world 
notation. 

Thus  it  is  not  merely  clumsiness  of  syntax  that  leads  us  to  prefer  the  modal  language  for 
purposes of problem specification, as one might be led to prefer PASCAL to FORTRAN.
It  is  a  fundamental  difference  in  conceptual  power,  like  the  difference  between  both  of  these 
languages  and  assembly  language. 

Given that the modal language in some ways has greater conceptual power than the
possible-world  language  because  of  its  ability  to  express  intensional  concepts,  we  might  ask 
how  it  is  possible  to  axiomatize  the  interpretation  of  the  modal  language  in  an  extensional 
first-order  logic.  Again  the  analogy  with  programming  languages  is  helpful.  Even  though 
one  programming  language  may  be  more  powerful  conceptually  than  another,  it  is  always 
possible  to  write  an  interpreter  for  the  first  language  in  the  second.  Thus,  we  can  write  a 
LISP  interpreter  in  assembly  language,  even  though  LISP  has  recursion  and  assembly 
language  does  not.  We  can  do  this  because  the  interpreter  treats  LISP  programs  as  data. 
We have the same sort of situation with respect to logical languages. Our extensional
meta-language can interpret intensional object-language formulas, because those formulas are
treated  as  logical  individuals  (i.e.,  data)  in  the  meta-language. 

4.2  A  First-Order  Treatment  of  the  Propositional  Logic  of  Knowledge 

We will develop our formalism for reasoning about knowledge and action in a staged
fashion. In this section we will deal with the propositional logic of knowledge, stating the
necessary axioms and illustrating their use in some sample deductions. In the next section,
we will introduce quantifiers and predicates, and in chapter 5 we will extend the formalism
to handle our integrated theory of knowledge and action.

As we discussed in the previous section, the modal object language will be encoded as
term expressions in a first-order meta-language. Typically this sort of thing is done using
string operations like concatenation, so that the conjunction of P and Q would be
represented by something like '('^P^'∧'^Q^')'. This would be interpreted as the string
consisting of a left parenthesis followed by P followed by the conjunction symbol followed
by Q followed by a right parenthesis. Thus the meta-language expression '('^P^'∧'^Q^')' would
denote the object-language expression (P ∧ Q).

There  is  a  much  more  elegant  way  to  do  the  encoding,  however,  which  is  due  to 
McCarthy  (1962).  For  purposes  of  semantic  interpretation  of  the  object  language,  which  is 
what we want to do, the details of the syntax of that language are largely irrelevant. In
particular,  the  only  thing  we  need  to  know  about  the  syntax  of  conjunctions  is  that  there  is 
some  way  of  taking  P  and  Q  and  producing  the  conjunction  of  P  and  Q.  We  can  represent 
this  by  having  a  function  And  such  that  And(P,Q)  denotes  the  conjunction  of  P  and  Q.  To 
use  McCarthy’s  term,  And(P,Q)  is  an  abstract  syntax  for  representing  the  conjunction  of  P 
and  Q.  We  will  represent  all  the  logical  operators  of  the  object  language  by  functions  in  an 
abstract  syntax. 

The  object  language  will  contain  the  usual  logical  operators  and  quantifiers  with 
equality, and the modal operators Know, Res, Real, and Can. The predicates, functions, and
constants  will  vary  from  example  to  example,  but  will  include  the  function  Do  discussed  in 
chapter  3,  and  the  composition  functions  for  sequences,  conditionals,  and  iterations  of 
actions.  The  meta-language  will  include  all  the  well-formed  expressions  of  the  object 
language  as  terms,  the  truth  predicate  T,  the  accessibility  relations  R  and  K,  and  analogues  of 
all the nonintensional constructs of the object language. The meta-language will also
contain some additional constructs which will be introduced later. The domain of discourse
of the meta-language includes the domain of discourse of the object language, plus
object-language expressions and possible situations/worlds.




Since  the  axioms  of  our  formalism  will  be  introduced  gradually  over  this  chapter  and 
the  next  with  a  large  amount  of  intervening  material,  we  list  all  of  them  in  appendix  A, 
indicating  where  in  the  text  they  are  introduced.  The  axioms  that  specify  the  interpretation 
of  object-language  expressions  in  the  meta-language  constitute  a  recursive  definition  of  the 
truth  predicate  T.  Recall  that  T(W,P)  means  that  the  object  language  formula  denoted  by  P 
is  true  in  the  possible  world  denoted  by  W.  Often  we  will  want  to  say  that  a  formula  is 
simply true, i.e., true in the actual world. We will therefore introduce a special constant
symbol W0 to denote the actual world and a monadic truth predicate True to mean true in
the actual world. True is defined in terms of T and W0 by the following axiom:

L1. ∀p1(True(p1) ≡ T(W0,p1))

In our formalism (i.e., the first-order meta-language) all predicates, functions, and
constants will begin with upper-case letters, while variables will be in lower-case letters. We
will be using a many-sorted logic, with different sorts assigned to different sets of variables.
For instance, the variables w1, w2,... will range over possible worlds, the variables p1, p2,...
will range over object-language formulas, and the variables a1, a2,... will range over agents.
For the sake of clarity, we once again note that object-language expressions are not formulas
from the point of view of the logic. They are merely terms in the meta-language which we
interpret as representing formulas of another language outside the formal system.

The recursive definition of T for the propositional part of the object language is as
follows:

L2. ∀w1,p1,p2(T(w1,And(p1,p2)) ≡ (T(w1,p1) ∧ T(w1,p2)))

L3. ∀w1,p1,p2(T(w1,Or(p1,p2)) ≡ (T(w1,p1) ∨ T(w1,p2)))

L4. ∀w1,p1,p2(T(w1,(p1 => p2)) ≡ (T(w1,p1) ⊃ T(w1,p2)))

L5. ∀w1,p1,p2(T(w1,(p1 <=> p2)) ≡ (T(w1,p1) ≡ T(w1,p2)))

L6. ∀w1,p1(T(w1,Not(p1)) ≡ ¬T(w1,p1))

Axioms L1 - L6 just translate the logical connectives from the object language to the
meta-language,  using  the  ordinary  Tarskian  definition  of  truth.  For  instance,  according  to 
L2,  And(P,Q)  is  true  in  a  world  if  and  only  if  P  is  true  in  the  world  and  Q  is  true  in  the 
world.  The  other  axioms  state  that  all  the  truth-functional  connectives  are  "transparent"  to 
T  in  exactly  the  same  way.  As  we  pointed  out  in  section  2.3,  the  major  advantage  of 
analyzing  the  object  language  in  this  way  is  that  by  lifting  the  logical  connectives  directly 
into  the  meta-language,  we  avoid  having  to  formalize  object-language  axioms  or  rules  of 
inference  for  them,  and  thereby  avoid  the  problem  of  such  axioms  and  rules  taking  control 
of  the  deductive  process. 
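The recursive definition of T can be read as a small interpreter over an abstract syntax. The sketch below is an illustration under simplifying assumptions, not the formalism itself: the class names Atom, Implies, and Iff and the helper is_true are hypothetical, and a world is modeled simply as the set of atomic formulas that are true in it.

```python
from dataclasses import dataclass
from typing import FrozenSet, Union


@dataclass(frozen=True)
class Atom:
    name: str


@dataclass(frozen=True)
class Not:
    arg: "Formula"


@dataclass(frozen=True)
class And:
    left: "Formula"
    right: "Formula"


@dataclass(frozen=True)
class Or:
    left: "Formula"
    right: "Formula"


@dataclass(frozen=True)
class Implies:
    left: "Formula"
    right: "Formula"


@dataclass(frozen=True)
class Iff:
    left: "Formula"
    right: "Formula"


Formula = Union[Atom, Not, And, Or, Implies, Iff]
World = FrozenSet[str]  # assumption: a world is the set of atoms true in it


def T(w: World, p: Formula) -> bool:
    """Truth of object-language formula p in world w, following L2 - L6."""
    if isinstance(p, Atom):
        return p.name in w
    if isinstance(p, And):      # L2
        return T(w, p.left) and T(w, p.right)
    if isinstance(p, Or):       # L3
        return T(w, p.left) or T(w, p.right)
    if isinstance(p, Implies):  # L4
        return (not T(w, p.left)) or T(w, p.right)
    if isinstance(p, Iff):      # L5
        return T(w, p.left) == T(w, p.right)
    if isinstance(p, Not):      # L6
        return not T(w, p.arg)
    raise TypeError(f"not an object-language formula: {p!r}")


W0: World = frozenset({"P"})  # the actual world: P is true, Q is false


def is_true(p: Formula) -> bool:
    """L1: True(p) holds just in case T(W0, p)."""
    return T(W0, p)
```

Each connective is handled by the host language's own connective, so nothing like an object-language rule of inference appears in the interpreter. For instance, is_true(Or(Atom("P"), Atom("Q"))) evaluates the disjunction directly in W0.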

We  should  note  that  although  we  are  axiomatizing  truth  for  the  object  language,  the 
well-known results of Tarski (see Rogers (1971), pp. 210 - 215) on the impossibility of
axiomatizing  truth  do  not  apply  here.  What  Tarski  proved  was  that  it  is  impossible  for 
any  language  rich  enough  to  contain  arithmetic  to  consistently  axiomatize  its  own  truth 
conditions.  But  in  our  system  the  meta-language  axiomatizes  not  its  own  truth  conditions, 
but  rather  the  truth  conditions  of  the  object-language.  In  particular,  Tarski  showed  how  to 
construct a term S which denotes the sentence "S is not true." This immediately gives rise to
the  classical  liar  paradox.  This  construction  cannot  be  carried  out  in  our  system,  because 
there  is  no  representation  of  the  predicates  T  or  True  in  the  object-language. 

To  get  the  propositional  logic  of  knowledge,  we  need  add  only  the  following  three 
axioms: 

K1. ∀w1,trm.a1,p1(T(w1,Know(trm.a1,p1)) ≡ ∀w2(K(D(w1,trm.a1),w1,w2) ⊃ T(w2,p1)))

K2. ∀a1,w1(K(a1,w1,w1))

K3. ∀a1,w1,w2(K(a1,w1,w2) ⊃ ∀w3(K(a1,w2,w3) ⊃ K(a1,w1,w3)))

K1 gives the possible-world analysis for object-language formulas of the form Know(A,P).
The interpretation is that Know(A,P) is true in world W1 just in case P is true in every world
which is compatible with what the agent denoted by A in W1 knows in W1. Since an
object-language term may denote different individuals in different possible worlds, we
introduce the function D, such that D(W,A) is a meta-language term that refers to the
individual denoted by the object-language term A in world W. K represents the accessibility
relation associated with Know, so K(D(W1,A),W1,W2) is how we represent W2 being
compatible with what the agent denoted by A in W1 knows in W1.

As we pointed out in section 2.3, the principle embodied in K1 is what we use to infer
that an agent knows what is implied by his knowledge. Since this is not strictly true, in a
more thorough analysis we would regard the inference from the right side of K1 to the left
side  as  being  a  plausible  implication.  K2  and  K3  state  constraints  on  the  accessibility 
relation  K  that  we  use  to  capture  other  properties  of  knowledge.  Together,  they  require  that 
for a fixed agent A, K(A,w1,w2) must be a reflexive and transitive relation on possible worlds. We have

already  shown  in  section  2.3  that  this  entails  the  principles  that  anything  that  anyone  knows 
must  be  true,  and  that  if  someone  knows  something  he  knows  that  he  knows  it.  Below  we 
show how to derive these principles formally. Finally, the fact that K1 - K3 are asserted to
hold for all possible worlds entails that everyone knows the principles they embody, and
everyone knows that everyone knows, etc. In other words, these principles are common
knowledge. 
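To make K1 - K3 concrete, here is a minimal sketch that evaluates Know over a small, hand-built model. It is an illustration only: formulas are encoded as nested tuples (a hypothetical encoding), the agent term A is assumed to be a rigid designator so that D can be omitted, and the particular worlds and facts are invented for the example.

```python
from typing import Dict, Set, Tuple

World = str
Pair = Tuple[World, World]  # (w1, w2): w2 is compatible with what the agent knows in w1


def is_reflexive(rel: Set[Pair], worlds) -> bool:
    """K2: every world is compatible with what the agent knows in it."""
    return all((w, w) in rel for w in worlds)


def is_transitive(rel: Set[Pair]) -> bool:
    """K3: whatever is compatible from w2 is also compatible from w1."""
    return all((w1, w3) in rel
               for (w1, w2) in rel for (v, w3) in rel if v == w2)


def T(w: World, p, facts: Dict[World, Set[str]], K: Dict[str, Set[Pair]]) -> bool:
    """Truth of a tuple-encoded object-language formula p in world w."""
    op = p[0]
    if op == "atom":
        return p[1] in facts[w]
    if op == "not":
        return not T(w, p[1], facts, K)
    if op == "imp":
        return (not T(w, p[1], facts, K)) or T(w, p[2], facts, K)
    if op == "know":  # K1, with the agent term treated as rigid
        agent, body = p[1], p[2]
        return all(T(w2, body, facts, K) for (w1, w2) in K[agent] if w1 == w)
    raise ValueError(op)


worlds = ["W0", "W1", "W2"]
facts = {"W0": {"P", "Q"}, "W1": {"P"}, "W2": {"Q"}}
# A cannot tell W0 from W1; W2 is compatible only with itself.
K = {"A": {("W0", "W0"), ("W0", "W1"), ("W1", "W0"), ("W1", "W1"), ("W2", "W2")}}
P, Q = ("atom", "P"), ("atom", "Q")
```

In this model A knows P in W0, because P holds in both W0 and W1, but does not know Q, because Q fails in W1. Evaluating ("know", "A", ("know", "A", P)) in W0 also succeeds, as transitivity of K leads one to expect.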

One of the features of our formalism illustrated by K1 is the method of writing
meta-language variables for object-language terms. As we mentioned above, we are formalizing
the meta-language as a many-sorted first-order logic. Since the domain of discourse of the
meta-language includes various types of well-formed object-language expressions, we will
need meta-language variables to range over these expressions. We have already introduced
the variables p1, p2,... which range over object-language formulas. Now we introduce the
variables trm1, trm2,... to range over object-language terms. However, we also want to
consider the object language to be a many-sorted logic, so we will need meta-language
variables to range over object-language terms of a particular sort. Since the domain of
discourse of the meta-language includes the domain of discourse of the object language, we
will construct these variables so that if s1, s2,... are meta-language variables of sort s and
this sort is also in the domain of discourse of the object language, then trm.s1, trm.s2,... will
be meta-language variables that range over object-language terms of sort s. In K1, for
instance, trm.a1 ranges over object-language terms which refer to possible agents. Notice
that this requires us to allow sorts to be hierarchically organized since the range of trm.a1 is
a subset of the range of trm1.

To  illustrate  the  use  of  these  axioms,  we  will  show  how  to  derive  some  simple  results  in 
the  propositional  logic  of  knowledge.  To  simplify  the  formulas  in  these  examples,  we  will 
assume  that  the  object-language  term  A  is  a  rigid  designator  for  an  individual  who  is  also 
denoted by the meta-language term :A. This allows us to substitute :A for the more
complicated D(W,A). Our proofs will be in natural deduction form. The axioms and
preceding lines which justify each step will be given to the right of the step. Subordinate
proofs will be indicated by indented sections, and Ass will mark the assumptions on which
these subordinate proofs are based. Dis(n,m) will indicate the discharge of the assumption
on line n with respect to the conclusion on line m. The general pattern of proofs in this
system will be to assert the object-language premises of the problem, transform them into
their meta-language equivalents using axioms L1-L6 and K1, then derive the meta-language
expression of the conclusion using first-order logic and purely meta-language axioms such
as K2 and K3, and finally transform the conclusion back into the object language, again
using L1-L6 and K1. Our first example will be to show that axiom M4 in the modal logic
of knowledge follows from K1.


Prove:  True(Know(A,(P => Q)) => (Know(A,P) => Know(A,Q)))

 1.    T(W0,Know(A,(P => Q)))                          Ass
 2.    K(:A,W0,w1) ⊃ T(w1,(P => Q))                    K1,1
 3.      T(W0,Know(A,P))                               Ass
 4.      K(:A,W0,w1) ⊃ T(w1,P)                         K1,3
 5.        K(:A,W0,w1)                                 Ass
 6.        T(w1,(P => Q))                              2,5
 7.        T(w1,P) ⊃ T(w1,Q)                           L4,6
 8.        T(w1,P)                                     4,5
 9.        T(w1,Q)                                     7,8
10.      K(:A,W0,w1) ⊃ T(w1,Q)                         Dis(5,9)
11.      T(W0,Know(A,Q))                               K1,10
12.    T(W0,Know(A,P)) ⊃ T(W0,Know(A,Q))               Dis(3,11)
13.    T(W0,(Know(A,P) => Know(A,Q)))                  L4,12
14.  T(W0,Know(A,(P => Q))) ⊃ T(W0,(Know(A,P) => Know(A,Q)))    Dis(1,13)
15.  T(W0,(Know(A,(P => Q)) => (Know(A,P) => Know(A,Q))))       L4,14
16.  True(Know(A,(P => Q)) => (Know(A,P) => Know(A,Q)))         L1,15


This proof is completely straightforward. Lines 1 - 4 assume the two antecedent
conditions and then express them in possible-worlds notation. Then we pick w1 as a typical
world which is possible according to what A knows. In lines 5 - 9, we do the inference that
we want to attribute to A. Since this inference can be done in an arbitrarily chosen member
of the set of worlds which are possible for A, it must be valid in all of them (line 10). From
this we conclude that A can probably do the inference also (line 11). We then discharge our
assumptions and express the result in the same form as M4.

Another  interesting  example  is  the  inference  that  proved  to  be  so  troublesome  for  the 
data-base approach - concluding ¬Know(A,P) from Know(A,(P => Q)) and ¬Know(A,Q):


Given:  True(Know(A,(P => Q)))
        True(Not(Know(A,Q)))

Prove:  True(Not(Know(A,P)))

 1.  True(Know(A,(P => Q)))             Given
 2.  T(W0,Know(A,(P => Q)))             L1,1
 3.  K(:A,W0,w1) ⊃ T(w1,(P => Q))       K1,2
 4.  True(Not(Know(A,Q)))               Given
 5.  T(W0,Not(Know(A,Q)))               L1,4
 6.  ¬T(W0,Know(A,Q))                   L6,5
 7.  K(:A,W0,W1)                        K1,6
 8.  ¬T(W1,Q)                           K1,6
 9.  T(W1,(P => Q))                     3,7
10.  T(W1,P) ⊃ T(W1,Q)                  L4,9
11.  ¬T(W1,P)                           10,8
12.  ¬T(W0,Know(A,P))                   K1,7,11
13.  T(W0,Not(Know(A,P)))               L6,12
14.  True(Not(Know(A,P)))               L1,13


In  this  proof,  like  the  preceding  one,  most  of  the  steps  simply  translate  between  the 
modal notation and the possible-world notation. Lines 1 - 3 express the first premise in
possible-world notation. Lines 4 - 8 do the same for the second premise. The key step is
concluding from the fact that A doesn't know Q (line 6), that there is a world, W1, which is
compatible with everything that A knows and in which Q is false (lines 7 and 8). From this
and the fact that in every world compatible with what A knows, if P is true then Q is true, it
follows by modus tollens that P must be false in W1 (line 11). Translating back into modal
notation, we get that A doesn't know P.

For our final two examples in this section, we will show formally that K2 and K3 entail
M2  and  M3: 


Prove:  True(Know(A,P) => P)

1.    T(W0,Know(A,P))                   Ass
2.    K(:A,W0,w1) ⊃ T(w1,P)             K1,1
3.    K(:A,W0,W0)                       K2
4.    T(W0,P)                           2,3
5.  T(W0,Know(A,P)) ⊃ T(W0,P)           Dis(1,4)
6.  T(W0,(Know(A,P) => P))              L4,5
7.  True(Know(A,P) => P)                L1,6


We  assume  that  A  knows  that  P  (line  1),  so  P  must  be  true  in  every  world  which  is 
compatible with what A knows (line 2). By K2, the actual world, W0, must be compatible
with what A knows (line 3), so P must be true in W0 (line 4).


Prove:  True(Know(A,P) => Know(A,Know(A,P)))

 1.    T(W0,Know(A,P))                                  Ass
 2.    K(:A,W0,w1) ⊃ T(w1,P)                            K1,1
 3.    K(:A,W0,w1) ⊃ (K(:A,w1,w2) ⊃ K(:A,W0,w2))        K3
 4.      K(:A,W0,w1)                                    Ass
 5.      K(:A,w1,w2) ⊃ K(:A,W0,w2)                      3,4
 6.        K(:A,w1,w2)                                  Ass
 7.        K(:A,W0,w2)                                  5,6
 8.        T(w2,P)                                      2,7
 9.      K(:A,w1,w2) ⊃ T(w2,P)                          Dis(6,8)
10.      T(w1,Know(A,P))                                K1,9
11.    K(:A,W0,w1) ⊃ T(w1,Know(A,P))                    Dis(4,10)
12.    T(W0,Know(A,Know(A,P)))                          K1,11
13.  T(W0,Know(A,P)) ⊃ T(W0,Know(A,Know(A,P)))          Dis(1,12)
14.  T(W0,(Know(A,P) => Know(A,Know(A,P))))             L4,13
15.  True(Know(A,P) => Know(A,Know(A,P)))               L1,14

Again we assume that A knows that P (line 1), and again we conclude that P is true in
every world which is compatible with what A knows (line 2). We let w1 be a typical world
compatible with what A actually knows (line 4), and we let w2 be a typical world compatible
with what A knows in w1 (line 6). By K3, w2 must also be compatible with what A actually
knows (line 7), so P must be true in w2 (line 8). Therefore, A knows that P in w1 (line 10),
and A knows that A knows that P in the actual world (line 12).
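The four results derived above can also be checked mechanically on small models. The sketch below is a finite sanity check, not a proof: it enumerates every reflexive and transitive accessibility relation on three worlds and every assignment of truth values to P and Q, for a single agent whose name is assumed to be a rigid designator. The tuple encoding of formulas and all function names are hypothetical.

```python
from itertools import product

WORLDS = (0, 1, 2)
PAIRS = [(u, v) for u in WORLDS for v in WORLDS]


def preorders():
    """Every reflexive (K2) and transitive (K3) relation on WORLDS."""
    found = []
    for bits in product([False, True], repeat=len(PAIRS)):
        rel = {pair for pair, bit in zip(PAIRS, bits) if bit}
        reflexive = all((w, w) in rel for w in WORLDS)
        transitive = all((u, x) in rel
                         for (u, v) in rel for (v2, x) in rel if v == v2)
        if reflexive and transitive:
            found.append(rel)
    return found


def valuations():
    """Every assignment of truth values to P and Q in each world."""
    for bits in product([False, True], repeat=2 * len(WORLDS)):
        yield {w: {"P": bits[2 * w], "Q": bits[2 * w + 1]} for w in WORLDS}


def T(w, p, val, K):
    """Truth of a tuple-encoded formula p in world w."""
    op = p[0]
    if op == "atom":
        return val[w][p[1]]
    if op == "not":
        return not T(w, p[1], val, K)
    if op == "imp":
        return (not T(w, p[1], val, K)) or T(w, p[2], val, K)
    if op == "know":  # K1 for a single agent named by a rigid designator
        return all(T(v, p[1], val, K) for (u, v) in K if u == w)
    raise ValueError(op)


def valid(p):
    """True if p holds in every world of every model enumerated above."""
    return all(T(w, p, val, K)
               for K in preorders() for val in valuations() for w in WORLDS)


def know(p):
    return ("know", p)


def imp(p, q):
    return ("imp", p, q)


def neg(p):
    return ("not", p)


P, Q = ("atom", "P"), ("atom", "Q")
M2 = imp(know(P), P)
M3 = imp(know(P), know(know(P)))
M4 = imp(know(imp(P, Q)), imp(know(P), know(Q)))
SECOND = imp(know(imp(P, Q)), imp(neg(know(Q)), neg(know(P))))
```

All four formulas pass valid, while a non-theorem such as imp(P, know(P)) does not. The check says nothing about models with more worlds or about non-rigid agent terms.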

Actually, this last proof contains something of a cheat. When we made the assumption
that A was a rigid designator, we did so mainly to simplify the formulas that we had to work
with. In the proofs before this one, nothing depends on that assumption. In those proofs,
everything still goes through if D(W0,A) is used instead of :A. Here that is not the case, and
the proof depends on A being a rigid designator. The formalism that we are developing in
this chapter is correct, though. The problem is with the propositional logic of knowledge
that we presented back in section 2.1. Recall that the intent of M3 was to represent the fact
that if someone knows something, he knows that he knows it. It seems very natural to
assume that this entitles us to infer Know(A,Know(A,P)) from Know(A,P). This is not quite
right, however, because there is no guarantee that the person who is described by A will
know that he is the person described by A.

Suppose  the  richest  man  in  the  world  knows  that  he  has  less  than  ten  billion  dollars.  If 
we  apply  M3,  we  will  infer  that  the  richest  man  in  the  world  knows  that  the  richest  man  in 
the  world  knows  that  he  has  less  than  ten  billion  dollars.  This  might  not  be  true,  however, 
if  he  does  not  know  that  he  is  the  richest  man  in  the  world.  He  might  think  that  someone 
else  is  the  richest  man  in  the  world  and  has  more  than  ten  billion  dollars.  What  we  really 
want to infer is that the richest man in the world knows that he knows that he has less than
ten  billion  dollars.  This  is  in  fact  the  principle  that  is  captured  by  K3.  One  way  of  making 
the  M3  version  valid  is  to  restrict  the  term  that  denotes  the  knower  to  be  a  rigid  designator. 
That  way  we  can  be  sure  that  he  recognizes  it  as  a  description  of  himself.  That  is  what  we 
have  done  in  this  proof. 
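The failure of the M3 pattern for a non-rigid term can be exhibited in a three-world model. This is a sketch under stated assumptions: the individuals "a" and "b", the term names "Richest" and "A", and the table DENOTES are hypothetical, and the proposition P is treated as a plain atomic fact standing in for "he has less than ten billion dollars".

```python
WORLDS = ("W0", "W1", "W2")
FACTS = {"W0": {"P"}, "W1": {"P"}, "W2": set()}

# DENOTES[w][term]: the individual denoted by an object-language term in world w.
# "Richest" is non-rigid (a in W0, b elsewhere); "A" rigidly denotes a.
DENOTES = {
    "W0": {"Richest": "a", "A": "a"},
    "W1": {"Richest": "b", "A": "a"},
    "W2": {"Richest": "b", "A": "a"},
}

# Accessibility relation for each individual; both are reflexive and transitive.
K = {
    "a": {("W0", "W0"), ("W0", "W1"), ("W1", "W0"), ("W1", "W1"), ("W2", "W2")},
    "b": {("W0", "W0"), ("W1", "W1"), ("W2", "W2"), ("W1", "W2")},
}


def D(w, term):
    """The individual denoted by `term` in world w."""
    return DENOTES[w][term]


def T(w, p):
    """Truth of a tuple-encoded formula in world w."""
    op = p[0]
    if op == "atom":
        return p[1] in FACTS[w]
    if op == "know":
        knower = D(w, p[1])  # K1: the agent term is evaluated in the current world
        return all(T(w2, p[2]) for (w1, w2) in K[knower] if w1 == w)
    raise ValueError(op)


P = ("atom", "P")
richest_knows = ("know", "Richest", P)
richest_knows_richest_knows = ("know", "Richest", richest_knows)
a_knows_a_knows = ("know", "A", ("know", "A", P))
```

In W0 the richest man (individual a) knows P, yet richest_knows_richest_knows is false there: a considers W1 possible, and in W1 the term "Richest" picks out b, who does not know P. With the rigid term "A" the nested formula holds in W0, as in the proof above.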

In  this  section  we  have  seen  how  to  axiomatize  the  possible-world  semantics  of  the 
propositional modal logic of knowledge, so that its inferences can be captured in first-order
logic.  In  the  next  section,  we  will  extend  this  approach  to  handle  the  quantified  logic  of 
knowledge. 

4.3  Introducing Quantifiers, Predicates, and Equality

We will represent object-language quantifiers in our formalism by two functions in the
abstract syntax, Exist and All. These functions will take two arguments, a term denoting an
object-language variable and a term denoting an object-language formula, presumably
containing at least one free occurrence of the variable. The terms that denote
object-language variables will be meta-language constants beginning with a "?". The scheme
for indicating what sort the variable belongs to will be the same as in the meta-language, but
since formally these symbols are constants we will use upper-case rather than lower-case
letters. Thus, corresponding to the meta-language variable s1, we will have the
object-language variable ?S1. Exist(?S1,P) will denote the object-language formula which means
there is an individual of sort S such that the open formula P is true of that individual.
Similarly, All(?S1,P) will mean that P is true of every individual of sort S.

Axiomatizing  the  interpretation  of  quantified  object-language  formulas  presents  some 
minor technical problems. We would like to say something like Exist(?S1,P) is true in W just
in case there is some individual such that the open formula P is true of that individual in
W. We don't have a way of saying that an open formula is true of an individual in a
world, however; we just have the predicate T which simply says that a formula is true in a
world. One way of solving the problem would be to introduce a new predicate, or perhaps
redefine T, to express the Tarskian notion of satisfiability rather than truth. An elegant
way to do this is to borrow the computer science notion of a closure (Sussman and Steele,
1975), which can be defined as an ordered pair consisting of a formula containing free
variables and a set of bindings for those variables. If we used this notion we would talk
about closures being true in a world rather than formulas. In interpreting a closure,
whenever we came across a free variable we would use the binding specified by the closure
to interpret the variable. A closure whose body was Exist(?S1,P) would be true in W if there
is some individual of sort S such that the closure of P in which that individual is bound to
?S1 is true in W.

While  this  approach  is  semantically  elegant,  it  is  syntactically  clumsy,  as  it  requires  a 
complicated  syntax  to  describe  closures,  and  even  purely  propositional  formulas  have  to  be 
represented  as  closures  with  empty  sets  of  bindings.  We  will  take  the  simpler  approach  of 
finding  substitutions  for  the  free  variables  of  a  formula  such  that  the  resulting  formula  has 
the  same  truth  value  in  every  world  as  the  equivalent  closure.  Using  this  approach  to 
interpret  quantified  formulas,  for  every  individual  that  satisfies  the  open  formula  P  we  need 
to  be  able  to  find  a  term  that  can  be  substituted  for  the  free  variable  to  make  the  resulting 
closed  formula  true.  In  an  extensional  object  language  any  term  that  denotes  the  individual 
would  do.  Since  our  object-language  is  intensional,  however,  we  have  to  take  into  account 
whether  the  term  that  we  substitute  will  be  evaluated  with  respect  to  a  different  possible 
world  where  it  might  denote  a  different  individual.  Therefore,  we  must  use  a  rigid 
designator for the individual in question to ensure that all occurrences of the substituted
expression will denote the intended referent.

This  approach  is  semantically  unattractive  since  it  requires  us  to  assume  that  there  is  at 
least  one  rigid  designator  for  every  individual  in  the  domain  of  discourse.  Our  theory  does 
not  require  this,  and  we  even  pointed  out  in  section  2.5  that  not  requiring  all  individuals  to 
have  names  was  a  desirable  feature  of  the  theory  in  respect  to  interpreting  certain 
statements  about  knowing  what  something  is.  Therefore,  we  will  adopt  the  substitutional 
approach for its syntactic simplicity, but we will refrain from making use of it in any way
that  would  be  incompatible  with  the  closure  approach. 

In  order  to  formalize  the  substitutional  approach  to  the  interpretation  of  quantified 
object-language formulas, we need to be able to construct a rigid designator in the object
language for any arbitrary individual. Since our representation of the object language is in
the form of an abstract syntax, we can simply stipulate that there is a function @ such that
for any individual in the domain of discourse of the object language, if :X is a meta-language
term referring to that individual, then @(:X) denotes an object-language rigid
designator for that individual. (We can read @(:X) as "the standard name of :X".) We can
now state our interpretation rules for object-language quantifiers. In the following axiom
schemas P may be any object-language formula, ?S1 may be any object-language variable, s1
is the corresponding meta-language variable, and the notation P[Trm1/Trm2] indicates the
expression which results from substituting Trm1 for every free occurrence of Trm2 in P:

L7.  ∀w1(T(w1,Exist(?S1,P)) ≡ ∃s1(T(w1,P[@(s1)/?S1])))

L8.  ∀w1(T(w1,All(?S1,P)) ≡ ∀s1(T(w1,P[@(s1)/?S1])))

L7 says that an existentially quantified formula is true in a world W if there is some
individual  of  the  sort  indicated  by  the  bound  variable,  such  that  the  formula  which  results 
from  substituting  a  rigid  designator  for  that  individual  for  the  bound  variable  in  the  body 
of  the  formula  is  true  in  W.  L8  says  that  a  universally  quantified  formula  is  true  in  W  if 
every  individual  of  the  sort  indicated  by  the  bound  variable  is  such  that  the  formula  which 
results  from  substituting  a  rigid  designator  for  that  individual  for  the  bound  variable  in  the 
body  of  the  formula  is  true  in  W. 
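
The substitutional reading of L7 and L8 can be illustrated with a small evaluator. This is only a sketch under simplifying assumptions that are not part of the formalism above: there is a single sort, the domain is finite, every predicate is one-place, and a world is just a table from predicate names to the individuals they hold of. The class and function names (Var, StdName, Atom, substitute) are hypothetical; StdName plays the role of @.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    """An object-language variable such as ?X1."""
    name: str


@dataclass(frozen=True)
class StdName:
    """@(x): the standard name (rigid designator) of the individual x."""
    individual: str


@dataclass(frozen=True)
class Atom:
    """A one-place atomic formula P(term); term is a Var or a StdName."""
    pred: str
    arg: object


@dataclass(frozen=True)
class Exist:
    var: Var
    body: object


@dataclass(frozen=True)
class All:
    var: Var
    body: object


def substitute(formula, var, term):
    """Return P[term/var]: replace every free occurrence of var by term."""
    if isinstance(formula, Atom):
        return Atom(formula.pred, term if formula.arg == var else formula.arg)
    if isinstance(formula, (Exist, All)):
        if formula.var == var:
            # var is rebound by this quantifier, so it has no free occurrence inside.
            return formula
        return type(formula)(formula.var, substitute(formula.body, var, term))
    raise TypeError(f"unknown formula: {formula!r}")


def T(world, formula, domain):
    """Truth of a closed formula in a world (dict: predicate -> set of individuals)."""
    if isinstance(formula, Atom):
        if not isinstance(formula.arg, StdName):
            raise ValueError("T is only defined for closed formulas")
        return formula.arg.individual in world[formula.pred]
    if isinstance(formula, Exist):
        # L7: some individual's standard name makes the body true.
        return any(T(world, substitute(formula.body, formula.var, StdName(x)), domain)
                   for x in domain)
    if isinstance(formula, All):
        # L8: every individual's standard name makes the body true.
        return all(T(world, substitute(formula.body, formula.var, StdName(x)), domain)
                   for x in domain)
    raise TypeError(f"unknown formula: {formula!r}")
```

Because T is only ever applied to closed formulas, no separate notion of satisfaction by an assignment is needed, which is the syntactic simplification the substitutional approach is meant to buy.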

Note  that  one  of  the  instances  of  L7  asserts  the  equivalence  of  the  analysis  of  "knowing 
who"  in  terms  of  quantifying-in  with  the  analysis  in  terms  of  rigid  designators: 

T(w1,Exist(?X1,Know(A,Eq(?X1,B)))) ≡ ∃x1(T(w1,Know(A,Eq(@(x1),B))))

This  says  that  there  is  something  which  A  knows  to  be  B  (i.e.  A  knows  who  B  is),  just  in 
case there is some individual x1 such that A knows the proposition which asserts the
equality of a rigid designator for x1 and B.


Up  to  this  point  we  have  left  nonintensional  atomic  formulas  unanalyzed,  reasoning  in 
terms  of  meta-language  expressions  of  the  form  T(W,P).  In  order  to  analyze  object-language 
predicates  and  terms  we  will  need  a  few  new  tools.  We  have  already  introduced  one  of 
these tools, the function D which maps a possible world and an object-language term into
the denotation of that term in that world. Using this function, the simplest method of
interpreting object-language atomic formulas would be to introduce, for each n-ary
object-language predicate, an (n+1)-ary meta-language predicate that took as its arguments the
possible world in which the object-language formula is to be evaluated and the denotations
in that world of the arguments of the object-language predicate. If we did this, then
T(W,P(A)) would be analyzed as something like :P(W,D(W,A)). This treatment of predicates is
similar to that used by McCarthy (1963; McCarthy and Hayes, 1969) in his situation
calculus. 

This  approach,  however,  creates  problems  when  we  try  to  formalize  the  effects  of 
actions.  If  we  do  things  this  way,  when  we  axiomatize  a  particular  action  we  will  have  to 
say  explicitly  for  each  predicate,  function,  and  constant  how  the  action  changes  its  extension. 
That  is,  for  each  predicate  we  will  have  to  say  how  what  the  predicate  is  true  of  after  the 
action  is  performed  depends  on  what  is  true  before  the  action  is  performed.  Similarly,  for 
each  function  or  constant  we  must  say  how  the  action  might  change  the  referent  of  any 
terms that mention that function or constant. This problem was first pointed out by
McCarthy  and  Hayes  (1969)  and  was  called  by  them  the  frame  problem.  We  will  call  axioms 
that  describe  the  effects  of  actions  frame  axioms. 

It  has  often  been  noted  that  most  actions  affect  relatively  few  aspects  of  a  situation,  so 
that  the  most  concise  formulation  of  the  frame  axioms  for  an  action  would  be  to  explicitly 
state what things do change and then add that "everything else stays the same." If we
follow the approach of translating object-language predicates into meta-language predicates,
however,  we  cannot  express  the  notion  of  everything  else  staying  the  same,  because  our 
metalanguage  is  first-order,  and  we  would  have  to  quantify  over  those  predicates. 

So that we can state frame axioms more easily, we will adopt a notation where the meta-language
analogue of an object-language predicate is a function rather than a predicate.
We will analyze T(W,P(A)) as H(W,:P(D(W,A))). This formula can be read as ":P holds in W
for the denotation of A in W." The difference between P and :P is that the argument of P is
an object-language description of an individual which has to be interpreted relative to a
possible world; the argument of :P is a meta-language description of the individual which is
independent of possible worlds. (This is similar to the difference between Eval and Apply in
LISP. When a function is Apply'ed its arguments have already been evaluated with respect
to the relevant environment.) We can regard :P as a function which maps an individual into
an intensional object which may or may not hold in a given possible world. The
interpretation is that :P(:A) holds in W just in case :A has the property P in world W. H is
the meta-language predicate which expresses this relationship. The difference between H
and T is that T(W,P(A)) means that the object-language formula P(A) is true in the world W,
while H(W,:P(:A)) means that the individual :A has the property :P in W. That is, the latter
expression  gives  the  semantic  interpretation  of  the  former. 

If  we  use  this  notation  the  nonintensional  atomic  formulas  of  the  object  language  are 
analyzed in the meta-language as term expressions such as :P(:A). Since these are terms, we
can  quantify  over  them  in  our  first-order  meta-language  and  express  the  "everything  else 
remains  the  same"  clause  in  our  frame  axioms.  This  will  be  explained  in  more  detail  in  the 
next  chapter.  The  basic  idea  here  was  first  suggested  by  Kowalski  (1974)  as  a  modification 
of  McCarthy’s  situation  calculus.  The  integration  into  the  possible-worlds  framework  is 
original. 
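
The benefit of treating :P(:A) as a term can be sketched in a few lines. In the toy model below, which is an illustration rather than the formalism itself, a world is a finite set of tuples standing for the intensional objects that hold in it, H is set membership, and the "everything else remains the same" clause becomes an ordinary quantification over those tuples. The action paint and the helper frame_holds are hypothetical names introduced only for this example.

```python
def H(world, fluent):
    """H(W, :P(:A)): the intensional object `fluent` holds in `world`."""
    return fluent in world


def paint(world, block, color):
    """A hypothetical action: give `block` a new color, leave all else alone."""
    old_colors = {f for f in world if f[0] == "Color" and f[1] == block}
    return (world - old_colors) | {("Color", block, color)}


def frame_holds(before, after, affected):
    """The frame clause: every fluent outside `affected` holds after the
    action exactly when it held before. Note that this quantifies over
    fluents, which is possible only because they are terms (here, tuples)."""
    return all(H(before, f) == H(after, f)
               for f in before | after
               if f not in affected)
```

Had Color and On been meta-language predicates, as in the approach rejected above, frame_holds would have needed one clause per predicate instead of a single quantified condition.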

Now  we  can  state  the  interpretation  axioms  for  object-language  predicates.  These 
axioms  and  the  ones  that  follow  formalize  the  convention  that  for  most  predicates, 
functions, or constants in the object language, the corresponding construct in the meta-language
uses the same symbol preceded by a colon. L9a and L9b are axiom schemata,
where  P  can  be  any  nonintensional  atomic  predicate  in  the  object  language. 

L9a.  ∀w1,trm1,...,trmn(T(w1,P(trm1,...,trmn)) ≡ H(w1,:P(D(w1,trm1),...,D(w1,trmn))))
      if P is not an essential property of the things it is true of.

L9b.  ∀w1,trm1,...,trmn(T(w1,P(trm1,...,trmn)) ≡ :P(D(w1,trm1),...,D(w1,trmn)))
      if P is an essential property of the things it is true of.

L9a  and  L9b  both  say  that  an  atomic  formula  is  true  if  the  corresponding  relationship 
holds  among  the  referents  of  the  terms  in  the  formula.  The  distinction  between  an  essential 
and  a  non-essential  property  is  that  if  an  individual  (or  tuple  of  individuals)  has  an 
essential  property,  it  has  that  property  in  all  possible  worlds.  A  good  example  is  the  fact 
that  numbers  are  essentially  numbers.  That  is,  if  some  individual  is  a  number,  it  is  a 
number  regardless  of  what  possible  world  it  is  in.  It  would  make  no  sense  to  talk  about  a 
possible  world  where  the  number  five  were,  say,  a  chair.  So  the  translation  of  object- 
language  expressions  for  essential  properties  into  the  meta-language  requires  no  reference  to 
possible  worlds.  In  this  special  case,  the  meta-language  construct  corresponding  to  an 
object-language  predicate  is  in  fact  a  predicate.  This  is  the  reason  for  the  difference 
between  L9a  and  L9b. 

Whether  there  are  in  fact  such  things  as  essential  properties  is  a  very  controversial  issue 
in  philosophy.  The  idea  of  essential  properties  originated  with  Aristotle,  but  by  this 
century  the  idea  had  fallen  into  general  disrepute.  For  instance,  Quine  (1953)  bases  his 
attack  on  quantified  modal  logic  on  the  argument  that  such  logics  contain  sentences  whose 
interpretation  presupposes  the  existence  of  essential  properties,  which  Quine  takes  to  be  an 
incomprehensible notion. With the advances in formal semantics for modal logic, however,
essentialism has regained a certain degree of respectability, with Kripke (1972) arguing
rather  persuasively  in  its  favor. 

Whether  or  not  we  take  essentialism  seriously  as  a  philosophical  doctrine,  it  will  be 
convenient  to  identify  those  properties  in  any  problem  domain  which  are  unchangeable. 
We  will  sometimes  treat  certain  predicates  as  being  essential  properties  even  if  they  are  not, 
when we are interested only in a subset of possible worlds where they do not change. In
the  blocks  world,  we  will  consider  being  a  block  to  be  an  essential  property  of  all  blocks,  so 
long  as  we  don’t  consider  any  actions  which  change  blocks  into  non-blocks  or  vice-versa, 
and  we  are  willing  to  assume  that  the  robot  knows  of  all  blocks  that  they  are  blocks. 

This last point is particularly important. If one of the knowers we are considering does
not  know  that  some  object  exists,  then  that  object  cannot  have  any  essential  properties. 
This  is  because  the  object  would  have  to  have  those  properties  in  all  the  worlds  compatible 
with  what  that  knower  knows.  The  knower  then  would  know  that  the  object  had  these 
properties,  and  would,  therefore,  know  that  the  object  exists.  An  alternative  formulation  of 
the  notion  of  essential  properties  would  be  to  say  that  P  is  an  essential  property  of  A  if  A 
has  P  in  every  world  where  A  exists.  This  would  require  more  complicated  axioms, 
however,  and  would  not  help  in  any  of  the  examples  we  will  consider,  so  we  will  stay  with 
the  simpler  formulation. 

The next set of axioms specifies the translation of object-language terms into the meta-language.
L11a and L11b are axiom schemata where Cnst may be any object-language
constant. L12a and L12b are axiom schemata where F may be any object-language
function.


L10.  ∀w1,x1(D(w1,@(x1)) = x1)

L11a. ∀w1(D(w1,Cnst) = V(w1,:Cnst)) if Cnst is not a rigid designator.
L11b. ∀w1(D(w1,Cnst) = :Cnst) if Cnst is a rigid designator.

L12a. ∀w1,trm1,...,trmn(D(w1,F(trm1,...,trmn)) = V(w1,:F(D(w1,trm1),...,D(w1,trmn))))
      if F is not a rigid function.

L12b. ∀w1,trm1,...,trmn(D(w1,F(trm1,...,trmn)) = :F(D(w1,trm1),...,D(w1,trmn)))
      if F is a rigid function.

Since @(x1) is a rigid designator for x1, its value is x1 in every possible world. An
object-language constant which is not a rigid designator translates into an intensional object
(like  the  intensional  objects  corresponding  to  predicates),  which  determines  an  individual  in 
each  possible  world.  The  function  V  maps  a  possible  world  and  one  of  these  intensional 
objects  into  the  corresponding  individual.  The  reason  for  interpreting  object-language 
constants  this  way,  as  in  the  case  of  object-language  predicates,  is  to  be  able  to  state  frame 
axioms  more  easily.  The  referent  of  a  rigid  designator  does  not  depend  on  which  possible 
world it is evaluated in, so its translation into the meta-language is simply a constant.
Similarly,  non-rigid  object-language  functions  translate  into  meta-language  functions  which 
map  tuples  of  individuals  into  intensional  objects.  Rigid  object-language  functions  translate 
into  meta-language  functions  from  tuples  of  individuals  to  individuals. 
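
The denotation axioms L10 through L12b amount to a small recursive evaluator for terms. The sketch below assumes a toy encoding that is not part of the theory: worlds are named by strings, an intensional object is a table from world names to individuals, and object-language terms are tuples tagged "std", "const", or "func". The constants RichestMan and Five and the functions BossOf and Succ are invented for the illustration.

```python
# Intensional objects: tables from world names to individuals.
NONRIGID_CONSTANTS = {"RichestMan": {"W0": "smith", "W1": "jones"}}
RIGID_CONSTANTS = {"Five": 5}

# A non-rigid function maps individuals to an intensional object;
# a rigid function maps individuals directly to an individual.
NONRIGID_FUNCTIONS = {
    "BossOf": lambda person: {
        "smith": {"W0": "jones", "W1": "brown"},
        "jones": {"W0": "brown", "W1": "smith"},
    }[person],
}
RIGID_FUNCTIONS = {"Succ": lambda n: n + 1}


def V(world, intension):
    """Map a world and an intensional object to the individual it picks out."""
    return intension[world]


def D(world, term):
    """Denotation of an object-language term in a world."""
    tag = term[0]
    if tag == "std":                                   # L10: @(x) denotes x everywhere
        return term[1]
    if tag == "const":
        name = term[1]
        if name in RIGID_CONSTANTS:                    # L11b
            return RIGID_CONSTANTS[name]
        return V(world, NONRIGID_CONSTANTS[name])      # L11a
    if tag == "func":
        name = term[1]
        args = [D(world, t) for t in term[2:]]         # arguments evaluated in the same world
        if name in RIGID_FUNCTIONS:                    # L12b
            return RIGID_FUNCTIONS[name](*args)
        return V(world, NONRIGID_FUNCTIONS[name](*args))   # L12a
    raise ValueError(f"unknown term: {term!r}")
```

For instance, the term ("func", "BossOf", ("const", "RichestMan")) denotes jones in W0 but smith in W1, since both the constant and the function are evaluated relative to the world, while ("std", "smith") denotes smith in every world.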

The  final  logical  axiom  in  our  system  deals  with  equality: 

L13.  ∀w1,trm1,trm2(T(w1,Eq(trm1,trm2)) ≡ (D(w1,trm1) = D(w1,trm2)))

L13 is a special case of L9b, where the meta-language interpretation of the object-language
predicate Eq is the known meta-language predicate =, rather than an undefined predicate
:Eq. Two object-language terms are equal in a possible world if they name the same
individual in that possible world. Note that since = does not depend on what possible world
it  is  applied  in,  we  are  assuming  that  being  identical  to  oneself  is  an  essential  property  of 
every  individual. 

It  may  be  instructive  to  see  how  translation  into  the  meta-language  distinguishes 
quantifiers which have different scopes with respect to the operator Know. In our current
formalism, the examples from section 2.5 of differing quantifier scopes would be expressed
as  follows: 

(1)  True(Know(John,Exist(?X1,And(Transistor(?X1),Burned-out(?X1)))))

(2)  True(Exist(?X1,And(Transistor(?X1),Know(John,Burned-out(?X1)))))

Recall  that  (1)  says  that  John  knows  there  is  a  burned  out  transistor,  while  (2)  says  that 
there is a transistor which John knows is burned out.

Applying  the  axioms  we  have  just  given  will  produce  the  following  meta-language 
translations  for  these  two  formulas: 

(3)  ∀w1(K(:John,W0,w1) ⊃ ∃x1(H(w1,:Transistor(x1)) ∧ H(w1,:Burned-out(x1))))

(4)  ∃x1(H(W0,:Transistor(x1)) ∧ ∀w1(K(:John,W0,w1) ⊃ H(w1,:Burned-out(x1))))

(3)  says  that  in  every  world  which  is  compatible  with  what  John  knows  in  the  real  world, 
there is some transistor which is burned out. (4) says that there is some particular transistor
which is burned out in every world which is compatible with what John knows in the real
world. 
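
The difference between (3) and (4) can be checked directly on a small model. The following sketch assumes a hypothetical two-world model in which John cannot tell W0 from W1; the dictionaries WORLDS and K and the two function names are inventions of this example, and K(:John,W0,w1) is encoded as membership of w1 in K[("John", "W0")].

```python
# Which individuals each property holds of, world by world.
WORLDS = {
    "W0": {"Transistor": {"t1", "t2"}, "Burned-out": {"t1"}},
    "W1": {"Transistor": {"t1", "t2"}, "Burned-out": {"t2"}},
}
DOMAIN = {"t1", "t2"}

# Worlds compatible with what John knows in W0 (W0 itself is included, as K2 requires).
K = {("John", "W0"): {"W0", "W1"}}


def H(world, prop, x):
    """H(w, :P(x)): individual x has property P in world w."""
    return x in WORLDS[world][prop]


def knows_some_transistor_is_burned_out(agent, w0):
    """Formula (3): in every compatible world some transistor is burned out."""
    return all(any(H(w1, "Transistor", x) and H(w1, "Burned-out", x) for x in DOMAIN)
               for w1 in K[(agent, w0)])


def some_transistor_is_known_burned_out(agent, w0):
    """Formula (4): one particular transistor is burned out in every compatible world."""
    return any(H(w0, "Transistor", x)
               and all(H(w1, "Burned-out", x) for w1 in K[(agent, w0)])
               for x in DOMAIN)
```

In this model (3) holds and (4) fails: every world John considers possible contains a burned-out transistor, but it is a different transistor in each, so there is no single transistor he knows to be burned out.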

With  these  axioms  we  can  also  show  formally  how  the  fact  that  A  knows  that  P(C)  can 
be derived from the fact that A knows that P(B) and A knows that B = C.


Given:  True(Know(A,P(B)))
        True(Know(A,Eq(B,C)))

Prove:  True(Know(A,P(C)))

 1.  True(Know(A,P(B)))                        Given
 2.  T(W0,Know(A,P(B)))                        L1,1
 3.  K(D(W0,A),W0,w1) ⊃ T(w1,P(B))             K1,2
 4.  K(V(W0,:A),W0,w1) ⊃ T(w1,P(B))            L11a,3
 5.  True(Know(A,Eq(B,C)))                     Given
 6.  T(W0,Know(A,Eq(B,C)))                     L1,5
 7.  K(D(W0,A),W0,w1) ⊃ T(w1,Eq(B,C))          K1,6
 8.  K(V(W0,:A),W0,w1) ⊃ T(w1,Eq(B,C))         L11a,7
 9.  K(V(W0,:A),W0,w1)                         Ass
10.  T(w1,P(B))                                4,9
11.  H(w1,:P(D(w1,B)))                         L9a,10
12.  H(w1,:P(V(w1,:B)))                        L11a,11
13.  T(w1,Eq(B,C))                             8,9
14.  D(w1,B) = D(w1,C)                         L13,13
15.  V(w1,:B) = V(w1,:C)                       L11a,14
16.  H(w1,:P(V(w1,:C)))                        12,15
17.  H(w1,:P(D(w1,C)))                         L11a,16
18.  T(w1,P(C))                                L9a,17
19.  K(V(W0,:A),W0,w1) ⊃ T(w1,P(C))            Dis(9,18)
20.  K(D(W0,A),W0,w1) ⊃ T(w1,P(C))             L11a,19
21.  T(W0,Know(A,P(C)))                        K1,20
22.  True(Know(A,P(C)))                        L1,21

A knows that P(B) (line 1), so P(B) is true in every world compatible with what A knows
(line 4). Similarly, since A knows that B = C (line 5), B = C is true in every world compatible
with what A knows (line 8). Let w1 be one of these worlds (line 9). P(B) and B = C must be
true in w1 (lines 12 and 15), hence P(C) must be true in w1 (line 16). Therefore, P(C) is true
in every world compatible with what A knows (line 19), so A knows that P(C) (line 22). If
True(Eq(B,C)) were given instead of True(Know(A,Eq(B,C))), we would have had B = C true in
W0 instead of w1. In that case, the substitution of C for B in P(B) (line 16) would not have
been valid, and we could not have concluded that A knows that P(C). This proof seems
long because we have made each routine step a separate line. This is worth doing once to
illustrate all the formal details, but in subsequent proofs, we will combine some of the
routine steps to shorten the derivation.

Another good example of reasoning about equality and quantification in knowledge
contexts is to show formally that if A knows who B is and A knows who C is, then A must
know whether B = C. Recall that the modal formula that represents A knowing who B is, is
∃x(Know(A,(x = B))).


Given:  True(Exist(?X1,Know(A,Eq(?X1,B))))
        True(Exist(?X1,Know(A,Eq(?X1,C))))

Prove:  True(And((Eq(B,C) => Know(A,Eq(B,C))),(Not(Eq(B,C)) => Know(A,Not(Eq(B,C))))))

 1.  True(Exist(?X1,Know(A,Eq(?X1,B))))           Given
 2.  T(W0,Exist(?X1,Know(A,Eq(?X1,B))))           L1,1
 3.  ∃x1(T(W0,Know(A,Eq(@(x1),B))))               L7,2
 4.  T(W0,Know(A,Eq(@(:B'),B)))                   3
 5.  K(V(W0,:A),W0,w1) ⊃ T(w1,Eq(@(:B'),B))       K1,L11a,4
 6.  True(Exist(?X1,Know(A,Eq(?X1,C))))           Given
 7.  K(V(W0,:A),W0,w1) ⊃ T(w1,Eq(@(:C'),C))       L1,L7,K1,L11a,6

Lines 1-7 translate the premises from the object language into the meta-language, letting
:B' be the individual whom A knows to be the thing referred to by B, and letting :C' be the
individual whom A knows to be the thing referred to by C. Some of the intermediate steps
have been suppressed.

 8.  T(W0,Eq(@(:B'),B))                           K2,5
 9.  D(W0,@(:B')) = D(W0,B)                       L13,8
10.  :B' = D(W0,B)                                L10,9
11.  :B' = V(W0,:B)                               L11a,10
12.  T(W0,Eq(@(:C'),C))                           K2,7
13.  :C' = V(W0,:C)                               L13,L10,L11a,12

According to K2, the actual world must be compatible with what A knows, so B must denote
:B' in W0 (line 11). A similar argument applies to C (lines 12 and 13).

The rest of the proof is divided into two cases; we show that if B = C, then A knows that
B = C, and if B ≠ C, then A knows that B ≠ C.

14.  T(W0,Eq(B,C))                                Ass
15.  V(W0,:B) = V(W0,:C)                          L13,L11a,14
16.  :B' = :C'                                    11,13,15

First we assume that B = C is true in the actual world (line 14). According to lines 11 and
13, this means that :B' and :C' must be the same individual (line 16).

17.  K(V(W0,:A),W0,w1)                            Ass
18.  T(w1,Eq(@(:B'),B))                           5,17
19.  :B' = V(w1,:B)                               L13,L10,L11a,18
20.  T(w1,Eq(@(:C'),C))                           7,17
21.  :C' = V(w1,:C)                               L13,L10,L11a,20
22.  V(w1,:B) = V(w1,:C)                          16,19,21
23.  K(V(W0,:A),W0,w1) ⊃ (V(w1,:B) = V(w1,:C))    Dis(17,22)
24.  T(W0,Know(A,Eq(B,C)))                        L11a,L13,K1,23
25.  T(W0,Eq(B,C)) ⊃ T(W0,Know(A,Eq(B,C)))        Dis(14,24)
26.  T(W0,(Eq(B,C) => Know(A,Eq(B,C))))           L4,25

We let w1 be a typical world which is compatible with what A knows (line 17). Therefore,
it must be true in w1 that B denotes :B' (line 19) and C denotes :C' (line 21). Since :B' and
:C' are the same individual, B and C have the same denotation in w1 (line 22), so A must
know that B = C (line 24). Discharging the assumption, if B = C, then A knows that B = C
(line 26).

27.  T(W0,Not(Eq(B,C)))                           Ass
28.  V(W0,:B) ≠ V(W0,:C)                          L13,L11a,27
29.  :B' ≠ :C'                                    11,13,28
30.  K(V(W0,:A),W0,w1)                            Ass
31.  T(w1,Eq(@(:B'),B))                           5,30
32.  :B' = V(w1,:B)                               L13,L10,L11a,31
33.  T(w1,Eq(@(:C'),C))                           7,30
34.  :C' = V(w1,:C)                               L13,L10,L11a,33
35.  V(w1,:B) ≠ V(w1,:C)                          29,32,34
36.  K(V(W0,:A),W0,w1) ⊃ (V(w1,:B) ≠ V(w1,:C))    Dis(30,35)
37.  T(W0,Know(A,Not(Eq(B,C))))                   L11a,L13,L6,K1,36
38.  T(W0,Not(Eq(B,C))) ⊃ T(W0,Know(A,Not(Eq(B,C))))      Dis(27,37)
39.  T(W0,(Not(Eq(B,C)) => Know(A,Not(Eq(B,C)))))         L4,38
40.  T(W0,And((Eq(B,C) => Know(A,Eq(B,C))),               L2,26,39
              (Not(Eq(B,C)) => Know(A,Not(Eq(B,C))))))
41.  True(And((Eq(B,C) => Know(A,Eq(B,C))),               L1,40
              (Not(Eq(B,C)) => Know(A,Not(Eq(B,C))))))


In the second case, we assume that B ≠ C (line 27). This means that :B' and :C' are not the
same individual (line 29). By an argument completely parallel to the first case, we conclude
that if B ≠ C, then A knows that B ≠ C (line 39). Combining the two cases gives the desired
final result (line 41).
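
As a sanity check on the theorem just proved, the semantic claim can be verified by brute force over small models. The sketch below is an illustration under assumptions of its own: three worlds, a three-element domain, and W0 as the actual world, which is always kept accessible in line with K2. "Knowing who B is" is read as B having the same denotation in every accessible world; the function names are hypothetical.

```python
from itertools import product

WORLDS = ("W0", "W1", "W2")   # W0 plays the role of the actual world
DOMAIN = (1, 2, 3)


def knows_who(accessible, denotation):
    """Some individual is the term's denotation in every accessible world."""
    return any(all(denotation[w] == x for w in accessible) for x in DOMAIN)


def knows_whether_equal(accessible, den_b, den_c):
    """If B = C actually holds, it holds in every accessible world;
    if it actually fails, it fails in every accessible world."""
    equal_in_actual = den_b["W0"] == den_c["W0"]
    return all((den_b[w] == den_c[w]) == equal_in_actual for w in accessible)


def theorem_holds_in_all_models():
    """Enumerate every accessibility set containing W0 and every pair of
    denotation tables, looking for a counterexample to the theorem."""
    for keep in product((False, True), repeat=len(WORLDS) - 1):
        accessible = ["W0"] + [w for w, k in zip(WORLDS[1:], keep) if k]
        for bs in product(DOMAIN, repeat=len(WORLDS)):
            for cs in product(DOMAIN, repeat=len(WORLDS)):
                den_b = dict(zip(WORLDS, bs))
                den_c = dict(zip(WORLDS, cs))
                if knows_who(accessible, den_b) and knows_who(accessible, den_c):
                    if not knows_whether_equal(accessible, den_b, den_c):
                        return False
    return True
```

An exhaustive check of this kind is no substitute for the derivation, since it covers only models of one fixed size, but it makes the role of the two premises concrete: dropping either knows_who condition admits models in which A does not know whether B = C.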


5.  A  First-Order  Theory  of  Knowledge  and  Action 

5.1  Formalizing  the  Possible-World  Semantics  for  Actions 

In  the  preceding  chapter  we  showed  how  to  formalize  in  first-order  logic  the  possible- 
world  semantics  for  knowledge.  In  this  chapter  we  will  extend  that  formalism  to  encompass 
our integrated theory of knowledge and action. We will begin by presenting a first-order
treatment of the possible-world semantics for actions. In the rest of the chapter we will
bring in the ideas about the interaction of knowledge and action presented in chapter 3.

In chapter 3 we introduced the object-language modal operator Res which takes as
arguments a description of an event and a formula. The interpretation of Res(Ev,P) being
true in W was that it is possible for the event described by Ev to occur in W and if it did, P
would be true in the resulting situation. By assuming that all events are deterministic, we
could express this in terms of possible worlds by saying that there is some world which is
the result of the event described by Ev happening in W and in which P is true. In our
first-order formalism this is represented as follows:

R1.  ∀w1,trm.ev1,p1
     (T(w1,Res(trm.ev1,p1)) ≡ ∃w2(R(D(w1,trm.ev1),w1,w2) ∧ T(w2,p1)))

The only new notation introduced in this axiom is the variable trm.ev1 which ranges over
object-language  terms  that  denote  events. 

In  chapter  3  we  also  noted  that  the  events  that  we  are  interested  in  consist  of  agents 
performing  actions.  We  introduced  an  object-language  function  Do  such  that  Do(A,Act) 
names the event in which the agent described by A performs the action described by Act.
We  decided  to  let  Do  be  a  rigid  function,  so  that  Do(A,Act)  will  be  a  rigid  designator  of  an 
event  if  A  is  a  rigid  designator  of  an  agent  and  Act  is  a  rigid  designator  of  an  action. 
Hence, by axiom L12b, D(W,Do(A,Act)) = :Do(D(W,A),D(W,Act)).

We also introduced several operators to construct complex actions out of simpler ones: ;
for sequences, If for conditionals, and While for iterations. In chapter 3 we informally
described how a possible-world semantics could be given for these complex actions directly.
Here, however, we will take a slightly different approach. The problem is that to apply an
axiom like R1 to a formula containing a complex action description, we would have to
axiomatize what these action descriptions denote. That is, we would have to define how the
function D behaves with respect to these operators. We cannot simply apply the L12 axioms
because these complex action descriptions must be interpreted intensionally. In particular, if
we have an action described as a sequence, any expression mentioned in a step of the
sequence must be interpreted relative to the situation in which that step of the sequence is
executed. For example, if we execute the sequence "(chop down the tallest tree; chop down
the tallest tree)", the trees referred to by the two instances of "the tallest tree" will be
different. The same sort of thing occurs in the interpretation of programming language
expressions, e.g., (X <- X+1; X <- X+1). Here the interpretations of the two occurrences of X on
the right side of the assignment statements will be different.
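
The tallest-tree example is easy to reproduce. In the sketch below, a situation is modelled, purely for illustration, as a table from tree names to heights; the tree names and the helper functions are hypothetical. The description "the tallest tree" is re-evaluated in the situation in which each step is executed, so the two steps of the sequence act on different trees.

```python
def tallest_tree(situation):
    """Denotation of "the tallest tree" relative to a situation."""
    return max(situation, key=situation.get)


def chop_down_tallest(situation):
    """Execute one step: fell whatever "the tallest tree" denotes right now.
    Returns the resulting situation together with the tree that was felled."""
    target = tallest_tree(situation)
    remaining = {tree: h for tree, h in situation.items() if tree != target}
    return remaining, target


s0 = {"oak": 30, "pine": 25, "birch": 12}   # heights in arbitrary units
s1, first_felled = chop_down_tallest(s0)
s2, second_felled = chop_down_tallest(s1)
```

Here first_felled is the oak and second_felled is the pine. Evaluating the description once, in s0, and reusing its denotation for both steps would instead try to fell the oak twice, which is why the L12 axioms cannot be applied to the sequence as a whole.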

So,  the  interpretation  of  complex  action  descriptions  will  not  be  trivial.  In  general,  the 
natural  thing  to  take  as  the  denotation  of  a  complex  action  description  in  a  situation  W 
would  seem  to  be  the  particular  sequence  of  simple  actions  that  would  result  from  executing 
the  complex  description  in  W.  Determining  what  that  sequence  is  could  require  a  complex 
series of deductions and, if loops are involved, it may be undecidable. All we really want to
do  for  this  thesis,  however,  is  to  be  able  to  do  some  inferences  about  formulas  in  which  a 
complex action description appears as an argument to Res, Res!, or Can. We have already
argued  that  Can  has  to  be  defined  recursively  in  the  object-language.  If  we  look  back  at 
section  3.2,  we  see  that  defining  Can  this  way  did  not  require  talking  about  the  denotation  of 
complex  action  descriptions.  This  suggests  that  we  can  avoid  the  problem  altogether  by 


defining  Res  and  Rest  for  complex  actions  in  a  similar  way.  Of  course,  this  does  not  make 
the  theoretical  problem  go  away.  What  it  does  do  is  allow  us  to  confine  our  attention  to  the 
specific  problem  we  want  to  address,  deducing  formulas  containing  Ret  and  Ret!,  without 
having  to  deal  with  the  general  problem  of  what  sequence  of  simple  actions  is  denoted  by  a 
complex  action  description.  We  should  note,  though,  that  the  conceptual  framework  that  we 
have  developed  seems  to  be  adequate  for  attacking  this  harder  problem.  It  is  the 
procedural  difficulties  of  actually  getting  a  system  to  do  the  deductions  that  we  want  to  put 
off  for  further  research. 

Taking  these  considerations  into  account,  we  will  work  with  the  following  recursive 
definition  of  Res  for  sequences,  conditionals,  and  iterations: 

R2. ∀w1,trm.a1,trm.act1,trm.act2,p1
(T(w1,Res(Do(trm.a1,(trm.act1; trm.act2)),p1)) ≡
T(w1,Res(Do(trm.a1,trm.act1),Res(Do(@(D(w1,trm.a1)),trm.act2),p1))))

R3. ∀w1,trm.a1,trm.act1,trm.act2,p1,p2
(T(w1,Res(Do(trm.a1,If(p1,trm.act1,trm.act2)),p2)) ≡
((T(w1,p1) ∧ T(w1,Res(Do(trm.a1,trm.act1),p2))) ∨
(¬T(w1,p1) ∧ T(w1,Res(Do(trm.a1,trm.act2),p2)))))

R4. ∀w1,trm.a1,trm.act1,p1,p2
(T(w1,Res(Do(trm.a1,While(p1,trm.act1)),p2)) ≡
T(w1,Res(Do(trm.a1,If(p1,(trm.act1; While(p1,trm.act1)),Nil)),p2)))

R5. ∀trm.a1,w1,w2 (R(Do(trm.a1,Nil),w1,w2) ≡ (w1 = w2))

R2 defines Res for a sequence of actions. A proposition p1 is true in the situation
resulting from the agent trm.a1 carrying out the sequence of actions (trm.act1; trm.act2), just
in case p1 is true in the situation resulting from the agent trm.a1 carrying out the action
trm.act2 in the situation resulting from the agent trm.a1 carrying out the action trm.act1.
The agent of the second action is more precisely specified by @(D(w1,trm.a1)). This
expression denotes a rigid designator for the referent of trm.a1 in w1. This makes sure
that the agent of the second action is the same as the agent of the first action, in case the
referent of trm.a1 is changed by the first action.

R3 defines Res for a conditional action. A proposition p2 is true in the situation
resulting from the agent trm.a1 carrying out the conditional action If(p1,trm.act1,trm.act2),
just in case p1 is true and, in the situation resulting from the agent trm.a1 carrying out the
action trm.act1, p2 is true, or p1 is false and, in the situation resulting from the agent trm.a1
carrying out the action trm.act2, p2 is true.

R4 defines Res for an iterated action. A proposition p2 is true in the situation resulting
from the agent trm.a1 carrying out the iterated action While(p1,trm.act1), just in case p2 is
true in the situation resulting from the agent carrying out the action If(p1,(trm.act1;
While(p1,trm.act1)),Nil). That is, to carry out While(p1,trm.act1), an agent would repeat the
action trm.act1 as long as p1 remained true.

R5  defines  the  execution  of  the  null  action  as  the  event  which  maps  every  situation  into 
itself.  That  is,  the  null  action  changes  nothing.  We  introduce  the  action  Nil  merely  to  fill 
out  the  unused  branch  of  conditionals,  as  in  R4. 
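
The recursive shape of R2 - R5 can be mirrored in a small Python evaluator. This is a sketch under simplifying assumptions that are not part of the theory: situations are plain integers, propositions are Python predicates on situations, primitive actions are partial functions that return None when they cannot occur, and the agent argument is omitted. The tuple encoding of actions is hypothetical.

```python
def res(action, p):
    """Return the proposition Res(action, p) as a predicate on situations."""
    kind = action[0]
    if kind == "prim":                 # primitive: a partial function on situations
        _, step = action
        def holds(w):
            w2 = step(w)
            return w2 is not None and p(w2)
        return holds
    if kind == "nil":                  # R5: Nil maps every situation into itself
        return p
    if kind == "seq":                  # R2: Res((a1; a2), p) is Res(a1, Res(a2, p))
        _, first, second = action
        return res(first, res(second, p))
    if kind == "if":                   # R3: branch on the test in the current situation
        _, test, then_act, else_act = action
        return lambda w: res(then_act, p)(w) if test(w) else res(else_act, p)(w)
    if kind == "while":                # R4: While(t, a) unfolds to If(t, (a; While(t, a)), Nil)
        _, test, body = action
        unfolded = ("if", test, ("seq", body, action), ("nil",))
        return lambda w: res(unfolded, p)(w)
    raise ValueError(f"unknown action kind: {kind}")

# A primitive action that cannot occur when the counter is already zero.
decrement = ("prim", lambda w: w - 1 if w > 0 else None)
countdown = ("while", lambda w: w > 0, decrement)
```

Because Res(a2, p) is itself a proposition, the sequence case needs no notion of what the whole sequence denotes, which is the point made in the text. As the text warns, a loop whose test never becomes false would make this evaluator run forever.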

We can give a set of axioms for Res1 that parallel those for Res. Recall that Res1 is a
weaker operator than Res in that to deduce Res(Ev,P) we must show that it is possible for Ev
to occur, while to deduce Res1(Ev,P) we need only show that if Ev does occur P will be true
in the resulting situation. We can express this as follows:

R6. ∀w1,trm.ev1,p1
(T(w1,Res1(trm.ev1,p1)) ≡ ∀w2(R(D(w1,trm.ev1),w1,w2) ⊃ T(w2,p1)))

Res1(Ev,P) is true in W1 if, assuming W2 is the result of Ev happening in W1, P is true in
W2. We will not explicitly go through the axioms for sequences, conditionals, and iterations
for Res1, but we will simply note that they would be identical to R2 - R4, with Res1
substituted for Res.
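
The difference between the two operators can be sketched as follows. The encoding is an assumption made for illustration: an event is a function from a situation to the list of situations it can lead to, and an empty list means the event cannot occur. The door example is hypothetical.

```python
def successors(event, w):
    """Stand-in for the relation R: the situations `event` can lead to from w."""
    return event(w)

def res(event, p, w):
    # Res: the event can occur in w, and p holds in the resulting situation.
    results = successors(event, w)
    return len(results) > 0 and all(p(w2) for w2 in results)

def res1(event, p, w):
    # Res1 (R6): p holds in every w2 reachable by the event; vacuously true if none exists.
    return all(p(w2) for w2 in successors(event, w))

def open_door(w):
    """A door can be opened only if it is not locked."""
    return [dict(w, open=True)] if not w["locked"] else []

is_open = lambda w: w["open"]
locked = {"locked": True, "open": False}
unlocked = {"locked": False, "open": False}
```

In the locked situation Res1 holds vacuously while Res fails, which is the sense in which Res1 is the weaker operator.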

While  we  want  our  formalism  to  be  able  to  handle  the  problems  of  reasoning  about 
knowledge  and  action  in  as  general  a  way  as  possible,  there  will  obviously  have  to  be  special 
axioms  for  particular  actions.  After  all,  what  effect  an  action  has  on  the  world  is  a  question 
of physics, not logic. Still, our goal will be to put the minimum necessary amount of
information  into  the  axioms  for  specific  actions  and  to  use  general  principles  as  much  as 
possible. 

To  illustrate  how  information  about  the  physical  effects  of  a  specific  action  would  be 
represented  in  our  system  we  will  work  out  an  example  from  the  blocks  world.  Suppose  we 
have a table and a number of blocks. Any number of blocks can be on the table, but only
one block can be on a given block. We will assume there is one agent in the world, called
Hand, which can move a block using the action Puton, if the block being moved has nothing
on it and the destination is either the table or another block with nothing on it. We will
give  three  axioms  for  Puton: 

P1. ∀a1,x1,x2,w1
(∃w2(R(:Do(a1,:Puton(x1,x2)),w1,w2)) ≡
(:Block(x1) ∧ ∀x3(¬H(w1,:On(x3,x1))) ∧
(((x1 ≠ x2) ∧ ∀x3(¬H(w1,:On(x3,x2)))) ∨ :Table(x2))))

P2. ∀a1,x1,x2,w1,w2
(R(:Do(a1,:Puton(x1,x2)),w1,w2) ⊃
(H(w2,:On(x1,x2)) ∧ ∀x3((x2 ≠ x3) ⊃ ¬H(w2,:On(x1,x3)))))

P3. ∀a1,x1,x2,w1,w2
(R(:Do(a1,:Puton(x1,x2)),w1,w2) ⊃
(∀int.trm1(V(w1,int.trm1) = V(w2,int.trm1)) ∧
∀int.p1(∀x3(int.p1 ≠ :On(x1,x3)) ⊃ (H(w1,int.p1) ≡ H(w2,int.p1)))))




P1 gives the prerequisites for Puton. It is possible to put x1 on x2 just in case x1 is a
block with nothing on it, and x2 is either a different block with nothing on it or is the table.
Since we have made the meta-language predicates :Block and :Table independent of any
reference to possible worlds, we are treating being a block or being a table as essential
properties of blocks and tables. In the blocks world, this is enforced by the lack of any
actions which transform blocks and tables into anything else.

P2 and P3 are frame axioms which describe the effect of Puton. P2 says that in the
world resulting from putting x1 on x2, x1 is on x2 and is not on anything else. P3 describes
what Puton does not change. In P3, int.trm1 is a variable which ranges over the intensional
objects corresponding to object-language terms, and int.p1 is a variable which ranges over
the intensional objects corresponding to object-language propositions. What P3 asserts,
then, is that Puton(x1,x2) does not change the extension of any of the basic functions or
relations of the language except what x1 is on.
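
Read operationally, P1 - P3 describe a partial state-transition function. The sketch below is one possible rendering, assuming that a situation is a dictionary from each block to whatever it is on and that the table is named "Tbl"; neither assumption comes from the axioms themselves.

```python
TABLE = "Tbl"   # hypothetical name for the table

def clear(on, x):
    """True if no block is on x in the situation `on` (a dict from block to support)."""
    return all(support != x for support in on.values())

def can_puton(on, x1, x2):
    # P1: x1 is a block with nothing on it, and x2 is the table
    # or a different object with nothing on it.
    return x1 in on and clear(on, x1) and (x2 == TABLE or (x2 != x1 and clear(on, x2)))

def puton(on, x1, x2):
    """Return the resulting situation, or None if P1's prerequisites fail."""
    if not can_puton(on, x1, x2):
        return None
    new_on = dict(on)   # P3: every other On fact is carried over unchanged
    new_on[x1] = x2     # P2: x1 is on x2 and on nothing else
    return new_on

w0 = {"A": TABLE, "B": TABLE, "C": "A"}
w1 = puton(w0, "C", TABLE)
```

Copying the dictionary and overwriting a single entry plays the role of the frame axiom: the only thing that changes is what x1 is on.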

P3 illustrates the advantages of mapping object-language formulas and terms into
intensional objects in the meta-language as we discussed in section 4.3. To make effective
use of this axiom, though, we will have to have some knowledge built into the system about
what expressions for intensional objects are equal to each other. We will assume that two
terms which denote intensional objects and begin with ":" are equal only if they are the
same constant or are the same function with equal arguments. So :A will be implicitly
unequal to :B, and :On(x1,x2) will be equal to :On(x3,x4) only if x1 is equal to x3 and x2 is
equal to x4. This allows us to use axioms like P3 without having a large number of
inequality axioms for intensional objects.
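
This built-in equality convention is structural equality on terms. A minimal sketch, assuming a hypothetical Term class as the encoding of ":"-prefixed expressions:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Term:
    """Hypothetical encoding of a ':'-prefixed intensional object."""
    functor: str
    args: tuple = ()

def const(name):
    return Term(name)

def on(x, y):
    return Term("On", (x, y))

A, B, C = const("A"), const("B"), const("C")
```

The generated equality compares functor and arguments, so distinct constants are unequal and :On terms are equal exactly when their arguments are, with no separate inequality axioms.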

An  action  still  may  affect  a  great  many  relations  or  functions,  but  usually  we  can 
identify  a  relatively  small  number  in  terms  of  which  the  others  can  be  defined.  For 
instance, Above could be defined in terms of On:


ABV1. ∀w1,trm.x1,trm.x2
(T(w1,Above(trm.x1,trm.x2)) ≡
(T(w1,On(trm.x1,trm.x2)) ∨
∃x3(T(w1,Above(trm.x1,@(x3))) ∧ T(w1,Above(@(x3),trm.x2)))))

If we do not have intensional objects corresponding to propositions like Above(A,B),
then we will be forced to use the definition ABV1, and we will not be in danger of using
P3 to infer that Above(A,B) is still true after we move B. Most AI problem solving systems
have this technique embodied in their programs. What we have done here is to express it
formally.
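
A short sketch of the same idea in Python, assuming a situation is a dictionary from each block to its support (an encoding chosen for illustration): Above is never stored, only recomputed from On, so it cannot be wrongly carried across an action.

```python
def above(on, x, y):
    """True if x is above y: x is on y, or x is on something that is above y (ABV1)."""
    support = on.get(x)
    while support is not None:
        if support == y:
            return True
        support = on.get(support)   # follow the chain of supports downward
    return False

before = {"A": "B", "B": "C", "C": "Tbl"}
after = dict(before, B="Tbl")       # B, with A still on it, is moved to the table
```

Before the move A is above C; afterwards it is not, even though no fact mentioning A was changed directly.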

By asserting that P1 - P3 apply to all possible worlds, we are claiming that they apply in
all situations in the actual course of events; i.e., they are true at all times. Furthermore, we
are claiming that all agents know that P1 - P3 apply to all situations, and all agents know
that all agents know that they apply to all situations, etc. In other words, the facts about
Puton are assumed to be common knowledge.

We can use these axioms to do "program verification" for the blocks world. For
example, we can verify the solution to Sussman's (1975) "anomalous situation" problem.
Suppose that block A and block B are on the table, block C is the only thing on A, and
nothing is on B or C. We can achieve A on B and B on C by putting C on the table,
putting B on C, and putting A on B. This is expressed formally by the following deduction
(see figure 5.1 for a diagram of the relevant situations):


Given:  True(Block(A))
        True(Block(B))
        True(Block(C))
        True(Table(Tbl))
        True(On(A,Tbl))
        True(On(B,Tbl))
        True(On(C,A))
        True(All(X,(Not(Eq(X,C)) => Not(On(X,A)))))
        True(All(X,Not(On(X,B))))
        True(All(X,Not(On(X,C))))

Prove:  True(Res(Do(Hand,(Puton(C,Tbl); (Puton(B,C); Puton(A,B)))),And(On(A,B),On(B,C))))

1.  :Block(:A)                          Given,L1,L9b,L11b
2.  :Block(:B)                          Given,L1,L9b,L11b
3.  :Block(:C)                          Given,L1,L9b,L11b
4.  :Table(:Tbl)                        Given,L1,L9b,L11b
5.  H(W0,:On(:A,:Tbl))                  Given,L1,L9a,L11b
6.  H(W0,:On(:B,:Tbl))                  Given,L1,L9a,L11b
7.  H(W0,:On(:C,:A))                    Given,L1,L9a,L11b
8.  (x1 ≠ :C) ⊃ ¬H(W0,:On(x1,:A))       Given,L1,L8,L4,L6,L13,L10,L11b,L9a
9.  ¬H(W0,:On(x1,:B))                   Given,L1,L8,L6,L9a,L10,L11b
10. ¬H(W0,:On(x1,:C))                   Given,L1,L8,L6,L9a,L10,L11b


These  first  ten  lines  merely  translate  the  premises  of  the  problem  from  the  object  language 
to  the  meta-language.  We  are  assuming  that  A,  B,  C,  and  Tbl  are  the  standard  names  for 
the  objects  they  refer  to. 


11. R(:Do(:Hand,:Puton(:C,:Tbl)),W0,W1)                             3,10,4,P1
12. (x1 ≠ :Tbl) ⊃ ¬H(W1,:On(:C,x1))                                 11,P2
13. ∀x3(int.p1 ≠ :On(:C,x3)) ⊃ (H(W0,int.p1) ≡ H(W1,int.p1))        11,P3
14. ¬H(W1,:On(:C,:A))                                               12
15. ¬H(W1,:On(:C,:B))                                               12
16. ¬H(W1,:On(:C,:C))                                               12
17. x1 = :C                                                         Ass
18. ¬H(W1,:On(x1,:A))                                               14,17
19. ¬H(W1,:On(x1,:B))                                               15,17
20. ¬H(W1,:On(x1,:C))                                               16,17
21. (x1 = :C) ⊃ ¬H(W1,:On(x1,:A))                                   Dis(17,18)
22. (x1 = :C) ⊃ ¬H(W1,:On(x1,:B))                                   Dis(17,19)
23. (x1 = :C) ⊃ ¬H(W1,:On(x1,:C))                                   Dis(17,20)
24. x1 ≠ :C                                                         Ass
25. ¬H(W0,:On(x1,:A))                                               8,24
26. H(W0,:On(x1,x2)) ≡ H(W1,:On(x1,x2))                             13,24
27. ¬H(W1,:On(x1,:A))                                               25,26
28. ¬H(W1,:On(x1,:B))                                               9,26
29. ¬H(W1,:On(x1,:C))                                               10,26
30. (x1 ≠ :C) ⊃ ¬H(W1,:On(x1,:A))                                   Dis(24,27)
31. (x1 ≠ :C) ⊃ ¬H(W1,:On(x1,:B))                                   Dis(24,28)
32. (x1 ≠ :C) ⊃ ¬H(W1,:On(x1,:C))                                   Dis(24,29)
33. ¬H(W1,:On(x1,:A))                                               21,30
34. ¬H(W1,:On(x1,:B))                                               22,31
35. ¬H(W1,:On(x1,:C))                                               23,32

Lines 11 - 35 take us through the execution of the first step of the plan. Since nothing
is on C and Tbl is a table, there is a possible situation which is the outcome of putting C on
Tbl. We call this situation W1 (line 11). In W1, C is not on anything other than Tbl (line 12),
and everything besides C is where it was in the original situation, W0 (line 13). We
conclude that C is on none of A, B, or C in W1 (lines 14 - 16). In drawing these conclusions
we use an implicit rule that two standard names (e.g. A and Tbl) which are not identical do
not have the same referent.

We then do some reasoning by cases. First we suppose that the variable x1 equals C
(line 17). It follows immediately that x1 is not on A, B, or C in W1 (lines 21 - 23). Next we
assume that x1 is not equal to C (line 24). We can conclude that x1 is not on A in W0 (line
25), and that x1 is in the same place in W1 as in W0 (line 26). This means that x1 is not on A,
B, or C in W1 (lines 30 - 32). Therefore, nothing is on A, B, or C in W1 (lines 33 - 35).


36. R(:Do(:Hand,:Puton(:B,:C)),W1,W2)                               2,3,34,35,P1
37. H(W2,:On(:B,:C))                                                36,P2
38. (x1 ≠ :C) ⊃ ¬H(W2,:On(:B,x1))                                   36,P2
39. ∀x3(int.p1 ≠ :On(:B,x3)) ⊃ (H(W1,int.p1) ≡ H(W2,int.p1))        36,P3
40. ¬H(W2,:On(:B,:A))                                               38
41. ¬H(W2,:On(:B,:B))                                               38
42. x1 = :B                                                         Ass
43. ¬H(W2,:On(x1,:A))                                               40,42
44. ¬H(W2,:On(x1,:B))                                               41,42
45. (x1 = :B) ⊃ ¬H(W2,:On(x1,:A))                                   Dis(42,43)
46. (x1 = :B) ⊃ ¬H(W2,:On(x1,:B))                                   Dis(42,44)
47. x1 ≠ :B                                                         Ass
48. H(W1,:On(x1,x2)) ≡ H(W2,:On(x1,x2))                             39,47
49. ¬H(W2,:On(x1,:A))                                               33,48
50. ¬H(W2,:On(x1,:B))                                               34,48
51. (x1 ≠ :B) ⊃ ¬H(W2,:On(x1,:A))                                   Dis(47,49)
52. (x1 ≠ :B) ⊃ ¬H(W2,:On(x1,:B))                                   Dis(47,50)
53. ¬H(W2,:On(x1,:A))                                               45,51
54. ¬H(W2,:On(x1,:B))                                               46,52


Lines 36 - 54 describe the execution of the second step. Since nothing is on B or C in
W1, it is possible to put B on C (line 36). In the outcome of this action, W2, B is on only C
(lines 37 - 38), and everything besides B is where it was in W1 (line 39). In particular, B is
not on A or B (lines 40 - 41). As we did for C in W1, we reason by cases that nothing,
whether or not it is B, is on A or B in W2 (lines 42 - 54).


55. R(:Do(:Hand,:Puton(:A,:B)),W2,W3)                               1,2,53,54,P1
56. H(W3,:On(:A,:B))                                                55,P2
57. ∀x3(int.p1 ≠ :On(:A,x3)) ⊃ (H(W2,int.p1) ≡ H(W3,int.p1))        55,P3
58. H(W2,:On(:B,x2)) ≡ H(W3,:On(:B,x2))                             57
59. H(W3,:On(:B,:C))                                                37,58
60. T(W3,And(On(A,B),On(B,C)))                                      56,59,L2,L9a,L11b
61. R(D(W2,Do(Hand,Puton(A,B))),W2,W3)                              55,L11b,L12b
62. T(W2,Res(Do(Hand,Puton(A,B)),And(On(A,B),On(B,C))))             60,61,R1
63. R(D(W1,Do(Hand,Puton(B,C))),W1,W2)                              36,L11b,L12b
64. T(W1,Res(Do(Hand,Puton(B,C)),                                   62,63,R1
         Res(Do(Hand,Puton(A,B)),And(On(A,B),On(B,C)))))
65. T(W1,Res(Do(Hand,(Puton(B,C); Puton(A,B))),                     64,R2
         And(On(A,B),On(B,C))))
66. R(D(W0,Do(Hand,Puton(C,Tbl))),W0,W1)                            11,L11b,L12b
67. T(W0,Res(Do(Hand,Puton(C,Tbl)),                                 65,66,R1
         Res(Do(Hand,(Puton(B,C); Puton(A,B))),And(On(A,B),On(B,C)))))
68. T(W0,Res(Do(Hand,(Puton(C,Tbl); (Puton(B,C); Puton(A,B)))),     67,R2
         And(On(A,B),On(B,C))))
69. True(Res(Do(Hand,(Puton(C,Tbl); (Puton(B,C); Puton(A,B)))),     68,L1
         And(On(A,B),On(B,C))))


Now we consider the final step. Since nothing is on A or B in W2, it is possible to put A
on B, bringing about W3 (line 55). We know that in this situation, A is on B (line 56), and
everything else is where it was before (line 57). In particular, B is where it was before (line
58), on C (line 59). At this point we are essentially done. All that remains is to translate
our results back into the object language (lines 60 - 69).
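
The deduction can be cross-checked by simply executing the plan. The sketch below is not the proof procedure used in the text; it assumes a situation is a frozen set of (block, support) pairs, each standing for an On fact, and it treats anything named "Tbl" as the table.

```python
from functools import reduce

def clear(w, x):
    """True if nothing is on x in situation w (a set of (block, support) pairs)."""
    return all(below != x for (_, below) in w)

def puton(w, x, y, table="Tbl"):
    """Successor situation for Puton(x, y), or None when it cannot be executed."""
    if w is None or not clear(w, x) or not (y == table or (y != x and clear(w, y))):
        return None
    return frozenset((a, b) for (a, b) in w if a != x) | {(x, y)}

def run(w, plan):
    """Execute the steps of a plan in order, stopping at None if a step is impossible."""
    return reduce(lambda situation, step: puton(situation, *step), plan, w)

W0 = frozenset({("A", "Tbl"), ("B", "Tbl"), ("C", "A")})
plan = [("C", "Tbl"), ("B", "C"), ("A", "B")]
W3 = run(W0, plan)
goal_holds = W3 is not None and {("A", "B"), ("B", "C")} <= W3
```

Running the three steps from W0 yields a situation in which both goal facts hold, whereas trying to put A on B directly fails because C is on A.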

This example shows that we can express the usual AI approach to actions in a rigorous
possible-world formalism. In the rest of this chapter we will show how to formalize the
interactions between knowledge and action within the same framework.


5.2  Formalizing  the  Dependence  of  Action  on  Knowledge 

In section 3.2 we discussed the ways in which being able to act effectively depends on
knowledge. The conclusion we reached was that in general it is not necessary to regard
particular actions as having knowledge preconditions, but that using any action to achieve a
goal requires knowing what action to take. To formalize this idea we introduced the modal
operator Can(A,Act,P) to mean that the agent denoted by A can use the action described by
Act to achieve P, in the sense that A knows how to achieve P by performing Act. In section
4.1 we argued that the most natural way to specify Can formally is by a recursive definition
in terms of object-language expressions. That definition is given by axioms C1 - C4:

C1. ∀w1,trm.a1,trm.act1,p1
(T(w1,Know(trm.a1,And(Eq(@(D(w1,trm.act1)),trm.act1),
Res(Do(@(D(w1,trm.a1)),trm.act1),p1)))) ⊃
T(w1,Can(trm.a1,trm.act1,p1)))

C2. ∀w1,trm.a1,trm.act1,trm.act2,p1
(T(w1,Can(trm.a1,(trm.act1; trm.act2),p1)) ≡
T(w1,Can(trm.a1,trm.act1,Can(@(D(w1,trm.a1)),trm.act2,p1))))

C3. ∀w1,trm.a1,trm.act1,trm.act2,p1,p2
(T(w1,Can(trm.a1,If(p1,trm.act1,trm.act2),p2)) ≡
((T(w1,Know(trm.a1,p1)) ∧ T(w1,Can(trm.a1,trm.act1,p2))) ∨
(T(w1,Know(trm.a1,Not(p1))) ∧ T(w1,Can(trm.a1,trm.act2,p2)))))

C4. ∀w1,trm.a1,trm.act1,p1,p2
(T(w1,Can(trm.a1,While(p1,trm.act1),p2)) ≡
T(w1,Can(trm.a1,If(p1,(trm.act1; While(p1,trm.act1)),Nil),p2)))

C1 says that the agent named by trm.a1 can achieve p1 by doing the action named by
trm.act1, if he knows what action trm.act1 names and knows that if he does the action
named by trm.act1, p1 will result. The first of these conditions is expressed by saying that
there is a rigid designator (an executable description) for an action which the agent knows
describes the same action as trm.act1. C2 says that the agent named by trm.a1 can achieve
p1 by doing the sequence of actions (trm.act1; trm.act2), just in case by doing trm.act1, he can
achieve a state where by doing trm.act2, he can achieve p1.

C1 and C2 together imply that in performing a sequence of actions, an agent is not
required to know precisely what is to be done in the second part of the sequence until the
first part has been carried out. He does have to know some description of the second part
of the sequence, but not an executable description. This will allow for sequences of actions
in which the early stages are actions that gather information to find out what to do in the
later stages. In both these axioms trm.a1 is converted to a rigid designator to guarantee that
the agent knows that he is the one who is able to achieve the result described.

C3 says that the agent trm.a1 can achieve p2 by doing If(p1,trm.act1,trm.act2), just in case
he knows that p1 is true and he can achieve p2 by doing trm.act1, or he knows that p1 is
false and he can achieve p2 by doing trm.act2. C4 says that the agent trm.a1 can achieve p2
by doing While(p1,trm.act1), just in case he can achieve p2 by doing trm.act1 as long as p1
remains true.
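
The knowledge requirement in C3 can be sketched as follows, assuming that what an agent knows is represented by the list of worlds compatible with his knowledge, and that the two Can subgoals have already been decided and are passed in as booleans. Both simplifications, and the light example, are hypothetical.

```python
def knows(accessible, p):
    """Know(a, p): p is true in every world compatible with what the agent knows."""
    return all(p(w) for w in accessible)

def can_if(accessible, test, can_then, can_else):
    # C3: know the test is true and be able to do the first branch,
    # or know it is false and be able to do the second.
    knows_true = knows(accessible, test)
    knows_false = knows(accessible, lambda w: not test(w))
    return (knows_true and can_then) or (knows_false and can_else)

light_on = lambda w: w["light"]
sure = [{"light": True}, {"light": True}]
unsure = [{"light": True}, {"light": False}]
```

An agent who cannot tell whether the test holds cannot use the conditional, even if he could carry out either branch; this is where C3 differs from R3.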

We will illustrate the use of the operator Can with the sample problems from chapter 1
about  opening  safes.  First,  we  need  some  facts  about  dialing  combinations: 

D1. ∀a1,x1,x2,w1
(∃w2(R(:Do(a1,:Dial(x1,x2)),w1,w2)) ≡
(∃w3(x1 = V(w3,:Comb(x2))) ∧ :Safe(x2) ∧ H(w1,:At(a1,x2))))

D2. ∀a1,x1,x2,w1,w2
(R(:Do(a1,:Dial(x1,x2)),w1,w2) ⊃
(((x1 = V(w1,:Comb(x2))) ⊃ H(w2,:Open(x2))) ∧
(((x1 ≠ V(w1,:Comb(x2))) ∧ ¬H(w1,:Open(x2))) ⊃ ¬H(w2,:Open(x2))) ∧
(H(w1,:Open(x2)) ⊃ H(w2,:Open(x2)))))


We will let the action Dial refer to the entire sequence of turning the dial of the safe and
then attempting to turn the handle and open the safe. D1 says that an agent can dial x1 on
x2 if it is possible for x1 to be the combination of x2, and x2 is a safe, and the agent is at
the same place as the safe. D2 tells how dialing a combination affects whether the safe is
open: if the combination is the combination of the safe, then the safe will be open; if it is
not the combination of the safe and the safe was locked, the safe stays locked; if the safe
was already open, it stays open. Notice that we have asserted that these facts are true in all
possible worlds. This lets us infer that they are always true, everyone knows that they are
always true, everyone knows that everyone knows that they are always true, etc.
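
D1 and D2 can also be read as a partial transition function. In the sketch below the dictionary layout of a situation and the explicit set of possible combinations (standing in for the quantification over worlds in D1) are assumptions made for illustration.

```python
def dial(w, agent, number, safe, possible_numbers):
    """Return the situation after `agent` dials `number` on `safe`, or None if D1 fails."""
    # D1: the number could be a combination, the target is a safe, and the agent is at it.
    if number not in possible_numbers or safe not in w["comb"] or w["at"].get(agent) != safe:
        return None
    # D2: the right number opens the safe; an open safe stays open; otherwise it stays shut.
    is_open = w["open"][safe] or number == w["comb"][safe]
    return {**w, "open": {**w["open"], safe: is_open}}

w0 = {"comb": {"Sf1": 42}, "open": {"Sf1": False}, "at": {"John": "Sf1"}}
numbers = range(100)   # hypothetical space of possible combinations
```

Note that dialing a wrong number is still a possible action; it merely fails to open the safe. That is why knowing the combination does not appear among the prerequisites in D1.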

Besides  the  axioms  for  Dial,  we  will  need  one  additional  fact  in  order  to  work  out  our 
examples:

A1. ∀w1,a1,x1 (H(w1,:At(a1,x1)) ⊃ ∀w2(K(a1,w1,w2) ⊃ H(w2,:At(a1,x1))))

A1 says that when an agent is at the same place as some object, he knows that he is at
the same place as the object. This is not really true, of course; the object may be hidden so
that the person doesn't know that it is there. We justify our use of A1 in our examples by
the observation that if a person were asked under what conditions it is possible to open a
safe, he probably would not consider the possibility that the agent might be at the location
of the safe and not know it. Actually, dealing with all the unlikely ways in which a plan
might fail (dubbed the qualification problem by McCarthy (1977)) is a very serious problem
in AI for which no one seems to have a good solution, and it is beyond the scope of this
thesis to find one.

Our main example to illustrate the use of Can is to show that if John knows the
combination to the safe Sf1, and he is in the same place as Sf1, then he can open the safe
by dialing the combination. The interesting point is that knowing the combination of the
safe comes in, not as a specific precondition of the action, but as a way of satisfying the
general conditions on Can. The possible-world structure for this proof is pictured in figure
5.2.


Given:  True(Safe(Sf1))
        True(At(John,Sf1))
        True(Exist(?X1,Know(John,Eq(?X1,Comb(Sf1)))))

Prove:  True(Can(John,Dial(Comb(Sf1),Sf1),Open(Sf1)))

1.  :Safe(:Sf1)                                                     Given,L1,L9b,L11b
2.  H(W0,:At(:John,:Sf1))                                           Given,L1,L9a,L11b
3.  ∃x1(T(W0,Know(John,Eq(@(x1),Comb(Sf1)))))                       Given,L1,L7
4.  T(W0,Know(John,Eq(@(:C),Comb(Sf1))))                            3
5.  K(D(W0,John),W0,w1) ⊃ T(w1,Eq(@(:C),Comb(Sf1)))                 4,K1
6.  K(:John,W0,w1) ⊃ (D(w1,@(:C)) = D(w1,Comb(Sf1)))                5,L11b,L13
7.  K(:John,W0,w1) ⊃ (:C = V(w1,:Comb(:Sf1)))                       6,L10,L12a,L11b
8.  K(:John,W0,w1) ⊃ H(w1,:At(:John,:Sf1))                          2,A1
9.  :C = V(W0,:Comb(:Sf1))                                          7,K2
10. K(:John,W0,w1)                                                  Ass
11. :C = V(w1,:Comb(:Sf1))                                          7,10
12. V(W0,:Comb(:Sf1)) = V(w1,:Comb(:Sf1))                           9,11
13. :Dial(V(W0,:Comb(:Sf1)),:Sf1) = :Dial(V(w1,:Comb(:Sf1)),:Sf1)   12
14. D(W0,Dial(Comb(Sf1),Sf1)) = D(w1,Dial(Comb(Sf1),Sf1))           13,L11b,L12a
15. D(w1,@(D(W0,Dial(Comb(Sf1),Sf1)))) = D(w1,Dial(Comb(Sf1),Sf1))  14,L10
16. T(w1,Eq(@(D(W0,Dial(Comb(Sf1),Sf1))),Dial(Comb(Sf1),Sf1)))      15,L13
17. H(w1,:At(:John,:Sf1))                                           8,10
18. R(:Do(:John,:Dial(V(w1,:Comb(:Sf1)),:Sf1)),w1,W2)               1,17,D1
19. H(W2,:Open(:Sf1))                                               18,D2
20. T(W2,Open(Sf1))                                                 19,L11b,L9a
21. R(:Do(D(W0,John),:Dial(V(w1,:Comb(:Sf1)),:Sf1)),w1,W2)          18,L11b
22. R(:Do(D(w1,@(D(W0,John))),:Dial(V(w1,:Comb(:Sf1)),:Sf1)),w1,W2) 21,L10
23. T(w1,Res(Do(@(D(W0,John)),Dial(Comb(Sf1),Sf1)),Open(Sf1)))      22,L11b,L12a,L12b,R1
24. T(w1,And(Eq(@(D(W0,Dial(Comb(Sf1),Sf1))),Dial(Comb(Sf1),Sf1)),  16,23,L2
         Res(Do(@(D(W0,John)),Dial(Comb(Sf1),Sf1)),Open(Sf1))))
25. K(:John,W0,w1) ⊃                                                Dis(10,24)
    T(w1,And(Eq(@(D(W0,Dial(Comb(Sf1),Sf1))),Dial(Comb(Sf1),Sf1)),
         Res(Do(@(D(W0,John)),Dial(Comb(Sf1),Sf1)),Open(Sf1))))
26. T(W0,Know(John,                                                 25,L11b,K1
         And(Eq(@(D(W0,Dial(Comb(Sf1),Sf1))),Dial(Comb(Sf1),Sf1)),
         Res(Do(@(D(W0,John)),Dial(Comb(Sf1),Sf1)),Open(Sf1)))))
27. T(W0,Can(John,Dial(Comb(Sf1),Sf1),Open(Sf1)))                   26,C1
28. True(Can(John,Dial(Comb(Sf1),Sf1),Open(Sf1)))                   27,L1

Line 1 translates into the meta-language the premise that Sf1 is a safe, and line 2
translates the premise that John is at the same place as the safe. The third premise, that
John knows the combination to the safe, is handled by lines 3 - 7. We have stretched out
the translation of this premise into the meta-language to expose the details of how the
existential quantifier is handled. There is something which John knows to be the
combination of the safe, and we choose to call that thing :C (line 4). Since John is at the
same place as the safe, we conclude that he knows he is at the same place as the safe (line
8). Since John knows :C to be the combination of the safe, :C must, in fact, be the
combination to the safe (line 9).

To make deductions about what John knows, we let w1 be a typical world which is
possible according to what John knows (line 10). Since John knows that :C is the
combination of the safe, :C is the combination of the safe in w1 (line 11). Therefore, the
combination of the safe in w1 is the same as the combination of the safe in the actual
world, W0 (line 12), and the action of dialing the combination of the safe in w1 is the same
as the action of dialing the combination of the safe in W0 (line 16).

Since John is at the same place as the safe in w1 (line 17), and since the combination of
the safe is a possible combination, it is possible for John to dial the combination of the
safe in w1 (line 18). We will call the resulting situation W2. Since the combination dialed
is, in fact, the combination of the safe in w1, the safe is open in W2 (line 19). So, John
dialing the combination of the safe in w1 would result in the safe being open (lines 20 - 23).
Translating everything back into the object language, John knows which action dialing the
combination of the safe is, and he knows that dialing the combination of the safe will result
in the safe being open (lines 24 - 26). Therefore, John can open the safe by dialing the
combination (lines 27 and 28).

The  major  point  of  this  example  is  that  in  deducing  that  John  can  open  the  safe,  we 
did  not  have  as  an  explicit  piece  of  knowledge  the  fact  that  knowing  the  combination  is  one 
of  the  requirements  for  opening  a  safe.  Instead  we  used  a  much  more  general  piece  of 
knowledge,  the  fact  that  in  order  to  achieve  any  goal  it  is  necessary  to  have  an  executable 
description  of  a  procedure  that  causes  the  goal  to  be  satisfied.  In  this  case,  the  combination 
of  the  safe  is  part  of  the  executable  description  of  the  procedure  for  opening  safes. 
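
The role played by the executable description can be sketched in Python. The encoding is an assumption made for illustration: each world records only the safe's combination, the description Dial(Comb(Sf1),Sf1) denotes in each world the action of dialing that world's combination, and prerequisites such as being at the safe are taken for granted.

```python
def denotation(world):
    """The action denoted by Dial(Comb(Sf1), Sf1) in a world: dial that world's combination."""
    return ("dial", world["comb"])

def opens_safe(world, action):
    """D2, simplified: dialing opens the safe exactly when the number is the combination."""
    return action == ("dial", world["comb"])

def can_open(actual, accessible):
    # C1: in every world compatible with what the agent knows, the description names
    # the same action as in the actual world, and doing that action opens the safe.
    executable = denotation(actual)
    return all(denotation(w) == executable and opens_safe(w, executable)
               for w in accessible)

actual = {"comb": "12-34-56"}
knows_comb = [actual]                         # every accessible world agrees on the combination
ignorant = [actual, {"comb": "00-00-00"}]     # the agent cannot rule out another combination
```

Nothing here says "opening a safe requires knowing its combination". The ignorant agent fails only because the description picks out different actions in different worlds compatible with what he knows.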

In  this  section  we  have  looked  at  how  the  possibility  of  taking  effective  action  depends 
on  having  the  right  knowledge.  In  the  next  section  we  will  examine  how  actions  can  affect 
what  an  agent  knows. 


5.3  Formalizing  the  Effects  of  Action  on  Knowledge 

In  section  3.3  we  explained  how  our  theory  handles  the  effects  of  an  action  on  the 
knowledge  of  the  agent.  The  basic  idea  was  to  represent  these  effects  by  a  pattern  of 
accessibility  relationships  among  various  possible  worlds.  We  distinguished  actions  on  the 
basis  of  whether  or  not  they  are  knowledge-producing,  a  knowledge-producing  action  being 
one  where  after  performing  the  action  the  agent  would  know  more  about  the  resulting 
situation  than  he  did  before  performing  the  action.  We  showed  how  to  account  for  this 
distinction  in  terms  of  the  possible-world  semantics  for  knowledge  and  action. 

The  formalism  that  we  have  now  developed  is  adequate  to  capture  the  effects  on 
knowledge  of  both  types  of  actions.  As  an  example  of  an  action  which  is  not  knowledge- 
producing,  we  can  use  the  blocks  world  operation  Puton  that  we  formalized  in  section  5.1. 
We  can  extend  our  formalization  of  Puton  to  include  its  effect  on  the  knowledge  of  the  agent 
by  adding  the  following  axiom  to  those  we  have  already  given: 

P4. ∀a1,x1,x2,w1,w2
(R(:Do(a1,:Puton(x1,x2)),w1,w2) ⊃
∀w3(K(a1,w2,w3) ≡ ∃w4(K(a1,w1,w4) ∧ R(:Do(a1,:Puton(x1,x2)),w4,w3))))

P4 says that if w2 is the situation which results from a1 putting x1 on x2 in w1, then the
worlds which are compatible with what a1 knows in w2 are exactly those worlds (w3) which
are the result of a1 putting x1 on x2 in one of the worlds (w4) which are compatible with
what a1 knows in w1. This is exactly the situation that was illustrated by figure 3.2. We
can think of a1 putting x1 on x2 as having the effect of filtering out from the courses of
events compatible with what a1 knows in w1 all those in which some other event occurs at
that point in time.
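
P4 amounts to taking the image of the agent's accessible worlds under the action. A minimal sketch, assuming worlds are dictionaries and using a hypothetical simplified Puton(A,B) that is always possible:

```python
def update_knowledge(accessible, action):
    """P4: the worlds compatible with what the agent knows after acting are exactly
    the results of the action in the worlds compatible with what he knew before."""
    results = [action(w) for w in accessible]
    return [w for w in results if w is not None]

def puton_a_b(w):
    """Hypothetical simplified Puton(A, B): afterwards A is on B, nothing else changes."""
    return dict(w, A="B")

# Before acting, the agent does not know whether C is on the table or on D.
before = [{"A": "Tbl", "C": "Tbl"}, {"A": "Tbl", "C": "D"}]
after = update_knowledge(before, puton_a_b)
```

Afterwards A is on B in every accessible world, so the agent knows it, but his uncertainty about C is unchanged: the action is not knowledge-producing.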

Using P4, we can show that after performing Puton(A,B), an agent would know that A is
on B. In this example we will not be interested in showing that the agent is able to put A
on B, so we will use the weak modal operator for actions, Res1.

Prove: True(Res1(Do(Hand,Puton(A,B)),Know(Hand,On(A,B))))

1.  R(:Do(:Hand,:Puton(:A,:B)),W0,w1)                                    Ass
2.  K(:Hand,w1,w2)                                                       Ass
3.  ∃w3(K(:Hand,W0,w3) ∧ R(:Do(:Hand,:Puton(:A,:B)),w3,w2))              1,2,P4
4.  R(:Do(:Hand,:Puton(:A,:B)),W3,w2)                                    3
5.  H(w2,:On(:A,:B))                                                     4,P2
6.  T(w2,On(A,B))                                                        5,L9a,L11b,L12b
7.  K(:Hand,w1,w2) ⊃ T(w2,On(A,B))                                       Dis(2,6)
8.  T(w1,Know(Hand,On(A,B)))                                             7,L11b,K1
9.  R(:Do(:Hand,:Puton(:A,:B)),W0,w1) ⊃ T(w1,Know(Hand,On(A,B)))         Dis(1,8)
10. T(W0,Res1(Do(Hand,Puton(A,B)),Know(Hand,On(A,B))))                   9,L11b,L12b,R6
11. True(Res1(Do(Hand,Puton(A,B)),Know(Hand,On(A,B))))                   10,L1

We start by assuming that w1 is the world which results from Hand putting A on B in
W0 (line 1). Notice that we are assuming that Hand, A, and B are all rigid designators. The
assumption that Hand is a rigid designator is merely a convenience. A and B, on the other
hand, must be rigid designators for the conclusion to be valid in the exact form in which it
is stated. If A and B were not rigid designators, Hand might not recognize them as referring
to the objects he acted upon, so we would have to substitute something like

Exist(?X1,And(Eq(?X1,A),Exist(?X2,And(Eq(?X2,B),Know(Hand,On(?X1,?X2))))))

for  Know(Hand,On(A,B))  in  the  conclusion.  This  would  make  the  proof  longer,  but  not  really 
any  harder. 

We let w2 be a typical world which is possible according to what Hand knows in w1. P4
implies that w2 must be the result of Hand putting A on B in some other world, say W3 (line
4). So, A must be on B in w2 (lines 5 - 6); hence in w1, Hand knows that A is on B (lines 7 -
8). This leads to the conclusion that in the result of putting A on B, Hand knows that A is
on B (lines 9 - 11). At a very general level, the argument of this proof is that Hand knows
that he has put A on B, and he understands the effects of putting A on B, so he must know
that A is on B.

The analysis we presented for knowledge-producing actions is similar to that for actions
that are not knowledge-producing, except that we also take into account the knowledge
gained. We can use the action Dial from the previous section as an example of a
knowledge-producing action if we assume that after trying to open a safe by dialing a
combination the agent knows whether he has succeeded. This can be a genuine increase in his
knowledge, since he might not know beforehand whether he would succeed. We can express this
fact by adding D3 to our axioms for Dial:

D3. ∀a1,x1,x2,w1,w2
    (R(:Do(a1,:Dial(x1,x2)),w1,w2) ⊃
     ∀w3(K(a1,w2,w3) ≡ ((H(w2,:Open(x2)) ≡ H(w3,:Open(x2))) ∧
                        ∃w4(K(a1,w1,w4) ∧ R(:Do(a1,:Dial(x1,x2)),w4,w3)))))

D3  describes  how  dialing  affects  the  knowledge  of  the  dialer.  Roughly  it  says  that  the 
agent  knows  he  has  done  the  dialing,  and  he  now  knows  whether  the  safe  is  open.  More 
precisely,  it  says  that  the  worlds  that  are  now  possible  as  far  as  he  knows  are  exactly  those 
which  are  the  result  of  doing  the  action  in  some  world  which  was  previously  possible 
according  to  what  he  knew,  and  which  agree  with  the  world  which  actually  results  from 
trying  to  open  the  safe  as  to  whether  the  safe  is  open.  This  is  the  type  of  situation  that  was 
pictured  in  figure  3.3. 
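The two conditions in D3 can be sketched as a filter over the agent's alternatives. This is a minimal illustration, assuming a world is fully described by the safe's combination and whether it is open; `World`, `dial`, and `update_dial` are hypothetical names introduced here.

```python
from typing import NamedTuple, Set, Tuple


class World(NamedTuple):
    comb: str       # the safe's combination in this world
    is_open: bool   # whether the safe is open in this world


def dial(world: World, dialed: str) -> World:
    """Illustrative Dial: the safe opens if the dialed combination is right;
    otherwise its state is unchanged. The combination itself never changes."""
    return World(world.comb, world.is_open or dialed == world.comb)


def update_dial(actual: World, alternatives: Set[World],
                dialed: str) -> Tuple[World, Set[World]]:
    """D3 read as an update: keep the results of dialing in each old alternative,
    but only those that agree with the actual result on whether the safe is open."""
    actual_after = dial(actual, dialed)
    kept = {dial(w, dialed) for w in alternatives
            if dial(w, dialed).is_open == actual_after.is_open}
    return actual_after, kept


# John is unsure whether the combination is "12" or "34"; it is in fact "12".
actual = World("12", False)
alternatives = {World("12", False), World("34", False)}
opened, after_success = update_dial(actual, alternatives, "12")
still_shut, after_failure = update_dial(actual, alternatives, "34")
```

After either attempt every remaining alternative agrees with the actual world on whether the safe is open, so in this sketch John knows the outcome in both cases.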

We  can  show  that  this  axiom  implies  that  after  trying  to  open  a  safe  an  agent  would 
know  whether  the  safe  were  open: 

Given: True(Res(Do(John,Dial(C1,Sf1)),Open(Sf1)))

Prove: True(Res(Do(John,Dial(C1,Sf1)),Know(John,Open(Sf1))))

1.  T(W0,Res(Do(John,Dial(C1,Sf1)),Open(Sf1)))                           Given,L1
2.  ∃w1(R(:Do(:John,:Dial(:C1,:Sf1)),W0,w1) ∧ H(w1,:Open(:Sf1)))         1,R1,L11b,L12b,L9a
3.  R(:Do(:John,:Dial(:C1,:Sf1)),W0,W1)                                  2
4.  H(W1,:Open(:Sf1))                                                    2
5.  K(:John,W1,w2)                                                       Ass
6.  H(W1,:Open(:Sf1)) ≡ H(w2,:Open(:Sf1))                                3,5,D3
7.  H(w2,:Open(:Sf1))                                                    4,6
8.  T(w2,Open(Sf1))                                                      7,L11b,L9a
9.  K(:John,W1,w2) ⊃ T(w2,Open(Sf1))                                     Dis(5,8)
10. T(W1,Know(John,Open(Sf1)))                                           9,L11b,K1
11. T(W0,Res(Do(John,Dial(C1,Sf1)),Know(John,Open(Sf1))))                3,L11b,L12b,10,R1
12. True(Res(Do(John,Dial(C1,Sf1)),Know(John,Open(Sf1))))                11,L1


In the first case, we assume that John dialing the combination C1 on the safe Sf1 results
in the safe being open. Line 2 translates this premise into the meta-language. For the same
reasons as in the previous example, we assume that John, C1, and Sf1 are rigid designators.
We let W1 be the world which results from John trying to open Sf1 (line 3), so the safe is
open in W1 (line 4). We let w2 be a typical world which is possible according to what John
knows in W1 (line 5). D3 implies that w2 must agree with W1 as to whether the safe is open
(line 6), so the safe must be open in w2 (lines 7 - 8). Therefore, in W1 John knows that the
safe is open (lines 9 - 10), so after trying to open the safe, John knows that the safe is open
(lines 11 - 12).


Given: True(Res(Do(John,Dial(C1,Sf1)),Not(Open(Sf1))))

Prove: True(Res(Do(John,Dial(C1,Sf1)),Know(John,Not(Open(Sf1)))))

1.  T(W0,Res(Do(John,Dial(C1,Sf1)),Not(Open(Sf1))))                      Given,L1
2.  ∃w1(R(:Do(:John,:Dial(:C1,:Sf1)),W0,w1) ∧ ¬H(w1,:Open(:Sf1)))        1,R1,L11b,L12b,L6,L9a
3.  R(:Do(:John,:Dial(:C1,:Sf1)),W0,W1)                                  2
4.  ¬H(W1,:Open(:Sf1))                                                   2
5.  K(:John,W1,w2)                                                       Ass
6.  H(W1,:Open(:Sf1)) ≡ H(w2,:Open(:Sf1))                                3,5,D3
7.  ¬H(w2,:Open(:Sf1))                                                   4,6
8.  T(w2,Not(Open(Sf1)))                                                 7,L11b,L9a,L6
9.  K(:John,W1,w2) ⊃ T(w2,Not(Open(Sf1)))                                Dis(5,8)
10. T(W1,Know(John,Not(Open(Sf1))))                                      9,L11b,K1
11. T(W0,Res(Do(John,Dial(C1,Sf1)),Know(John,Not(Open(Sf1)))))           3,L11b,L12b,10,R1
12. True(Res(Do(John,Dial(C1,Sf1)),Know(John,Not(Open(Sf1)))))           11,L1

The proof of the second case is identical in form to the proof of the first case. This
time we are given that the safe is not open in the result of John trying to open it, so the
safe is not open in W1 (line 4). This in turn implies that the safe is not open in w2 (line 7),
so after trying to open the safe, John knows that the safe is not open (line 12).

A more interesting example is the second of our benchmark problems from chapter 1:
showing that after trying to open the safe Sf1, which he knows to be locked, by dialing C1,
John would know whether C1 is the combination of Sf1. This proof depends on the facts
that after trying to open the safe, John knows that he tried to open the safe, he knows
whether the safe is open, and he understands how the safe being open depends on whether
the combination he dialed is the combination of the safe. In addition, John must know that
trying to open the safe does not change the combination. To show this we need an
additional frame axiom for Dial:

D4. ∀a1,x1,x2,w1,w2
    (R(:Do(a1,:Dial(x1,x2)),w1,w2) ⊃
     (∀int.trm1(V(w1,int.trm1) = V(w2,int.trm1)) ∧
      ∀int.p1((int.p1 ≠ :Open(x2)) ⊃ (H(w1,int.p1) ≡ H(w2,int.p1)))))

D4 says that Dial doesn't affect any basic function or relation other than whether the safe is
open. Therefore Dial does not change the combination of the safe.

We will divide the proof into two cases, according to whether or not C1 is the
combination of Sf1. The structure of the possible worlds mentioned in the proof of the first
case is illustrated in figure 5.3.
Figure 5.3 "C1 is the combination of Sf1."
Given: True(Know(John,Not(Open(Sf1))))
       True(Eq(C1,Comb(Sf1)))

Prove: True(Res1(Do(John,Dial(C1,Sf1)),Know(John,Eq(C1,Comb(Sf1)))))

1.  K(:John,W0,w1) ⊃ ¬H(w1,:Open(:Sf1))                                  Given,L1,K1,L4,L9a,L11b
2.  :C1 = V(W0,:Comb(:Sf1))                                              Given,L1,L13,L11b,L12b
3.  R(:Do(:John,:Dial(:C1,:Sf1)),W0,w1)                                  Ass
4.  H(w1,:Open(:Sf1))                                                    3,2,D2
5.  K(:John,w1,w2)                                                       Ass
6.  H(w1,:Open(:Sf1)) ≡ H(w2,:Open(:Sf1))                                3,5,D3
7.  H(w2,:Open(:Sf1))                                                    4,6
8.  ∃w3(K(:John,W0,w3) ∧ R(:Do(:John,:Dial(:C1,:Sf1)),w3,w2))            3,5,D3
9.  K(:John,W0,W3)                                                       8
10. ¬H(W3,:Open(:Sf1))                                                   1,9
11. R(:Do(:John,:Dial(:C1,:Sf1)),W3,w2)                                  8
12. ((:C1 ≠ V(W3,:Comb(:Sf1))) ∧ ¬H(W3,:Open(:Sf1))) ⊃
        ¬H(w2,:Open(:Sf1))                                               11,D2
13. (:C1 = V(W3,:Comb(:Sf1))) ∨ H(W3,:Open(:Sf1))                        7,12
14. :C1 = V(W3,:Comb(:Sf1))                                              10,13
15. V(W3,:Comb(:Sf1)) = V(w2,:Comb(:Sf1))                                11,D4
16. :C1 = V(w2,:Comb(:Sf1))                                              14,15
17. T(w2,Eq(C1,Comb(Sf1)))                                               16,L11b,L12b,L13
18. K(:John,w1,w2) ⊃ T(w2,Eq(C1,Comb(Sf1)))                              Dis(5,17)
19. T(w1,Know(John,Eq(C1,Comb(Sf1))))                                    18,L11b,K1
20. R(:Do(:John,:Dial(:C1,:Sf1)),W0,w1) ⊃
        T(w1,Know(John,Eq(C1,Comb(Sf1))))                                Dis(3,19)
21. T(W0,Res1(Do(John,Dial(C1,Sf1)),Know(John,Eq(C1,Comb(Sf1)))))        20,L11b,L12b,R6
22. True(Res1(Do(John,Dial(C1,Sf1)),Know(John,Eq(C1,Comb(Sf1)))))        21,L1


Lines 1 and 2 translate into the meta-language the premises that John knows the safe is
locked and that C1 is the combination to the safe. We let w1 be the result of John trying to
open the safe in W0 (line 3). Since C1 is the combination to the safe, the safe will be open
in w1 (line 4). We then let w2 be a typical world which is possible according to what John
knows in w1 (line 5). D3 implies that w2 must agree with w1 as to whether the safe is open
(line 6), so the safe must be open in w2 (line 7). D3 also implies that w2 must be the result
of John trying to open the safe in some world, say W3, which is possible according to what
John knows in W0 (lines 8 - 9, 11). Since in W0 John knows that the safe is locked, the safe
must be locked in W3 (line 10). But according to D2, if C1 were not the combination of the
safe in W3, since the safe is locked in W3, the safe would still be locked in w2 (line 12).
Since the safe is open in w2, C1 must be the combination of the safe in W3 (lines 13 - 14).
Since trying to open a safe does not change the combination (line 15), C1 is still the
combination of the safe in w2 (lines 16 - 17). Therefore, in w1 John knows that C1 is the
combination of the safe (lines 18 - 19), i.e., after trying to open the safe John knows that C1
is the combination of the safe (lines 20 - 22).
Given: True(Know(John,Not(Open(Sf1))))
       True(Not(Eq(C1,Comb(Sf1))))

Prove: True(Res1(Do(John,Dial(C1,Sf1)),Know(John,Not(Eq(C1,Comb(Sf1))))))

1.  K(:John,W0,w1) ⊃ ¬H(w1,:Open(:Sf1))                                  Given,L1,K1,L4,L9a,L11b
2.  :C1 ≠ V(W0,:Comb(:Sf1))                                              Given,L1,L6,L13,L11b,L12b
3.  ¬H(W0,:Open(:Sf1))                                                   1,K2
4.  R(:Do(:John,:Dial(:C1,:Sf1)),W0,w1)                                  Ass
5.  ¬H(w1,:Open(:Sf1))                                                   4,2,3,D2
6.  K(:John,w1,w2)                                                       Ass
7.  H(w1,:Open(:Sf1)) ≡ H(w2,:Open(:Sf1))                                4,6,D3
8.  ¬H(w2,:Open(:Sf1))                                                   5,7
9.  ∃w3(K(:John,W0,w3) ∧ R(:Do(:John,:Dial(:C1,:Sf1)),w3,w2))            4,6,D3
10. R(:Do(:John,:Dial(:C1,:Sf1)),W3,w2)                                  9
11. (:C1 = V(W3,:Comb(:Sf1))) ⊃ H(w2,:Open(:Sf1))                        10,D2
12. :C1 ≠ V(W3,:Comb(:Sf1))                                              8,11
13. V(W3,:Comb(:Sf1)) = V(w2,:Comb(:Sf1))                                10,D4
14. :C1 ≠ V(w2,:Comb(:Sf1))                                              12,13
15. T(w2,Not(Eq(C1,Comb(Sf1))))                                          14,L11b,L12b,L13,L6
16. K(:John,w1,w2) ⊃ T(w2,Not(Eq(C1,Comb(Sf1))))                         Dis(6,15)
17. T(w1,Know(John,Not(Eq(C1,Comb(Sf1)))))                               16,L11b,K1
18. R(:Do(:John,:Dial(:C1,:Sf1)),W0,w1) ⊃
        T(w1,Know(John,Not(Eq(C1,Comb(Sf1)))))                           Dis(4,17)
19. T(W0,Res1(Do(John,Dial(C1,Sf1)),
        Know(John,Not(Eq(C1,Comb(Sf1))))))                               18,L11b,L12b,R6
20. True(Res1(Do(John,Dial(C1,Sf1)),
        Know(John,Not(Eq(C1,Comb(Sf1))))))                               19,L1


The second case is proved very much like the first. Figure 5.4 gives the possible-world
structure for this case. Lines 1 and 2 translate into the meta-language the premises that
John knows the safe is locked and that C1 is not the combination to the safe. We note that
since John knows the safe is locked, the safe must be locked (line 3). We let w1 be the
result of John trying to open the safe in W0 (line 4). Since C1 is not the combination to the
safe, and the safe is locked in W0, the safe will remain locked in w1 (line 5). We then let w2
be a typical world which is possible according to what John knows in w1 (line 6). D3
implies that w2 must agree with w1 as to whether the safe is open (line 7), so the safe must
be locked in w2 (line 8). D3 also implies that w2 must be the result of John trying to open the
safe in some world, say W3, which is possible according to what John knows in W0 (lines 9 -
10). According to D2, if C1 were the combination of the safe in W3, the safe would be open
in w2 (line 11). Since the safe is still locked in w2, C1 must not be the combination of the
safe in W3 (line 12). Since trying to open a safe does not change the combination (line 13),
C1 is not the combination of the safe in w2, either (lines 14 - 15). Therefore, in w1 John
knows that C1 is not the combination of the safe (lines 16 - 17), i.e., after trying to open the
safe John knows that C1 is not the combination of the safe (lines 18 - 20).

Figure 5.4 "C1 is not the combination of Sf1."
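The two cases can also be checked by brute force on a small finite model. The sketch below is an illustration only: it assumes four hypothetical candidate combinations and represents a world as a (combination, is_open) pair. It also shows why the premise that John knows the safe to be locked is needed.

```python
from itertools import product
from typing import List, Tuple

COMBOS: List[str] = ["00", "01", "10", "11"]  # hypothetical candidate combinations

WorldPair = Tuple[str, bool]  # (combination of the safe, safe is open)


def result_of_dialing(world: WorldPair, dialed: str) -> WorldPair:
    """Effect of Dial as in D2 and D4: the safe opens if the dialed combination
    is correct, otherwise its state persists; the combination never changes."""
    comb, is_open = world
    return (comb, is_open or comb == dialed)


def knows_whether_comb(actual_comb: str, dialed: str,
                       known_locked: bool = True) -> bool:
    """True iff, after dialing, all of John's alternatives agree on whether
    the dialed combination is the combination of the safe."""
    open_states = [False] if known_locked else [False, True]
    before = [(c, o) for c in COMBOS for o in open_states]
    actual_after = result_of_dialing((actual_comb, False), dialed)
    # D3: keep results of dialing that agree with the actual result on Open.
    after = [result_of_dialing(w, dialed) for w in before
             if result_of_dialing(w, dialed)[1] == actual_after[1]]
    verdicts = {comb == dialed for comb, _ in after}
    return len(verdicts) == 1


always_learns = all(knows_whether_comb(a, d) for a, d in product(COMBOS, repeat=2))
learns_without_premise = knows_whether_comb("00", "00", known_locked=False)
```

In this model `always_learns` holds for every pair of actual and dialed combinations. When John does not know beforehand that the safe is locked, seeing it open after dialing the right combination no longer settles the question, because the safe might have been open all along.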

This example is a good illustration of the power to be gained from using a rigorous
logical formalism. The conclusion of this proof was not explicitly, or even consciously, built
into the axioms used in the proof. With the more ad hoc representation schemes that are
frequently used in AI, it often seems that an additional fact is required for each new
inference that is made. By making a thorough analysis of the problem domain and using a
powerful deductive formalism, we have created a much more robust system.


5.4  An  Example  of  Acquiring  Knowledge  Required  for  an  Action 

We conclude this chapter by considering the last of the sample problems from chapter 1.
This example shows how one step of a plan can produce knowledge which is necessary to
carry out the rest of the plan. One way of obtaining such knowledge is to read it
somewhere. To formalize this, we need a new action Read, a predicate Reads to say that an
agent can read, and the operator Info to say what information the thing being read contains.

INF1. ∀w1,trm.x1,exp1
      (T(w1,Info(trm.x1,exp1)) ≡ (exp1 = V(w1,:Info(D(w1,trm.x1)))))

RD1. ∀a1,x1,w1,w2
     (∃w2(R(:Do(a1,:Read(x1)),w1,w2)) ≡
      (H(w1,:Reads(a1)) ∧ H(w1,:At(a1,x1))))

RDS1. ∀w1,a1 (H(w1,:Reads(a1)) ⊃ ∀w2(K(a1,w1,w2) ⊃ H(w2,:Reads(a1))))

RD2. ∀a1,x1,w1,w2
     (R(:Do(a1,:Read(x1)),w1,w2) ⊃
      ∀w3(K(a1,w2,w3) ≡ ((V(w2,:Info(x1)) = V(w3,:Info(x1))) ∧
                         ∃w4(K(a1,w1,w4) ∧ R(:Do(a1,:Read(x1)),w4,w3)))))

RD3. ∀a1,x1,w1,w2
     (R(:Do(a1,:Read(x1)),w1,w2) ⊃
      (∀int.trm1(V(w1,int.trm1) = V(w2,int.trm1)) ∧
       ∀int.p1(H(w1,int.p1) ≡ H(w2,int.p1))))

To represent that an object has information written on it, we introduce the operator Info
into the object language. Info(trm.x1,exp1) will mean that the object-language expression
exp1 represents the information written on the referent of trm.x1. exp1 is a meta-language
variable that ranges over all well-formed object-language expressions, both terms and
formulas. It is important to realize that even though Info can take terms as arguments, it is
not an ordinary predicate. For example, if Father(John) is a term denoting the father of
John, then Info(Paper1,Father(John)) means that Paper1 has written on it some expression
(presumably in natural language) whose meaning is represented by the formal expression
Father(John). If Info were an ordinary predicate, Info(Paper1,Father(John)) would have to
assert some relation between the piece of paper and John's father, that is, between the
denotations of the two argument expressions. Here, however, we have a relation between
the denotation of the first argument expression and the second argument expression itself
(not its denotation). It might be more intuitive to quote the second argument (e.g.
Info(Paper1,Quote(Father(John)))), but we will want to quantify into the quoted context, and
quotation is usually interpreted as blocking such quantifications.

One of the advantages of working with both a meta-language and an object language is
that we can introduce an operator like Info whose semantics are unlike anything we have
seen before, and we can define those semantics right in the meta-language. This is done in
INF1. Notice that on the right side of INF1 the first argument of Info is evaluated, but the
second is not. Another unusual feature of INF1 is that :Info, the meta-language correlate of
Info, is the sort of expression usually associated with an object-language term. We might
have expected the right side of INF1 to be H(w1,:Info(D(w1,trm.x1),exp1)). Instead, we let
V(w1,:Info(D(w1,trm.x1))) be a meta-language term which denotes the information contained
in the referent of trm.x1 in w1. Info is treated this way because we want to imply that exp1
represents all the information contained in x1. :Info(x1,p1), however, would not do this, so
extra axioms would be required. On the other hand, we don't want to have Info as an
object-language function, because its referent would itself be an object-language expression.
We do not want to have object-language expressions as individuals in the object language,
because that might allow the introduction of self-referential statements, leading to the
familiar semantic paradoxes (e.g. statements like "This statement is false.").
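The asymmetry between the two argument positions of Info can be illustrated with a toy evaluator. This is a sketch under stated assumptions: the dictionary layout, the individual `paper#7`, the term `MyNote`, and the function `info_holds` are all hypothetical, and expressions are encoded as nested tuples.

```python
# Assumption: an expression is a constant name (str) or a tuple (functor, arg1, ...).
world = {
    # Denotations of object-language terms in this world.
    "denotation": {"Paper1": "paper#7", "MyNote": "paper#7"},
    # The expression written on each individual, stored unevaluated.
    "info": {"paper#7": ("Father", "John")},
}


def info_holds(w, term, expr):
    """INF1-style check: the first argument is evaluated to its referent,
    while the second is compared as an expression, not by its denotation."""
    referent = w["denotation"][term]
    return w["info"].get(referent) == expr


same_paper_other_name = info_holds(world, "MyNote", ("Father", "John"))
other_expression = info_holds(world, "Paper1", "Bill")
```

Because the first argument is evaluated, any term denoting the same piece of paper gives the same answer. Because the second is not, a different expression fails the check even if it happens to denote John's father.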

RD1 says that a1 can read x1 if and only if a1 can read, and a1 is at the same place as
x1. RDS1 says that if a1 can read, he knows that he can read. This fact is necessary for an
agent to be able to reason about how he can acquire knowledge. RD2 tells how reading
something affects the knowledge of the reader. It says that if w2 is the result of a1
reading x1 in w1, then the worlds which are possible according to what a1 knows in w2 are
exactly those worlds which satisfy both of the following conditions: First, they must agree
with w2 as to what information is contained in x1. Second, they must be the result of
reading x1 in some world which is possible according to what a1 knows in w1. Informally,
this means that after reading x1, a1 knows what information is contained in x1, and he
knows that he has read x1.
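The effect described by RD2 and RD3 can be sketched as follows. This is a minimal illustration, assuming a world records only what is written on the paper and the combination of the safe; `World`, `update_read`, and `knows_comb` are hypothetical names.

```python
from typing import NamedTuple, Set


class World(NamedTuple):
    info: str  # the expression written on the piece of paper
    comb: str  # the combination of the safe


def update_read(actual: World, alternatives: Set[World]) -> Set[World]:
    """RD3: reading changes nothing in the world itself.
    RD2: afterwards only alternatives that agree with the actual world
    on the content of the paper remain possible."""
    return {w for w in alternatives if w.info == actual.info}


def knows_comb(alternatives: Set[World]) -> bool:
    """The agent knows the combination iff all alternatives agree on it."""
    return len({w.comb for w in alternatives}) == 1


# John knows the paper carries the combination, but not which one it is.
actual = World(info="12", comb="12")
before = {World("12", "12"), World("34", "34")}
after = update_read(actual, before)
```

Before reading, the alternatives disagree on the combination. Afterwards only the world whose paper reads "12" survives, so knowing the content of the paper carries with it knowledge of the combination.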

We can use these axioms to show that if John has a piece of paper which he knows has
the combination of the safe Sf1 written on it, he can open the safe by reading the
combination from the piece of paper and then dialing it on the safe. The premises are as
follows:

Given: True(Safe(Sf1))
       True(At(John,Sf1))
       True(At(John,Ppr1))
       True(Reads(John))
       True(Know(John,Exist(?X1,And(Eq(?X1,Comb(Sf1)),Info(Ppr1,?X1)))))

The meanings of the first four premises should all be obvious. These are the conditions
which ensure that John can physically perform the actions required. The last premise is
the really interesting one. It says that John knows that the combination of the safe is
written on the piece of paper Ppr1. The interpretation of quantifiers in terms of
substituting rigid designators enables us to make sense out of quantifying into the second
argument position of Info. Technically, this violates our promise not to use the
substitutional analysis of quantification in any way that goes beyond the analysis in terms
of closures. In this case, however, the substitutional analysis is exactly what we want, since
there must be some linguistic expression written on the paper. If we were using the closure
approach, we would need a special axiom for this case. In particular, we want to infer that
it is the standard name of the combination of the safe that is written on the paper. We can
describe this as @(V(W0,:Comb(:Sf1))), which is equivalent to the expression we get by
eliminating the quantifier (see lines 5 and 6).

The possible-world structure for the proof for this example is pictured in figure 5.5.
This example provides an excellent illustration of the power of the possible-world
approach to reasoning about knowledge and action. The possible worlds mentioned in the
proof are related in a fairly complicated way by instances of K and R, and the whole pattern
of interconnection is needed to produce the desired conclusion.

Prove: True(Can(John,(Read(Ppr1); Dial(Comb(Sf1),Sf1)),Open(Sf1)))

1.  :Safe(:Sf1)                                                          Given,L1,L9b,L11b
2.  H(W0,:At(:John,:Sf1))                                                Given,L1,L9a,L11b
3.  H(W0,:At(:John,:Ppr1))                                               Given,L1,L9a,L11b
4.  H(W0,:Reads(:John))                                                  Given,L1,L9a,L11b
5.  K(:John,W0,w1) ⊃ ∃x1((x1 = V(w1,:Comb(:Sf1))) ∧                      Given,L1,K1,L11b,L7,
                          (@(x1) = V(w1,:Info(:Ppr1))))                  L2,L13,L12b,INF1
6.  K(:John,W0,w1) ⊃ (@(V(w1,:Comb(:Sf1))) = V(w1,:Info(:Ppr1)))         5
7.  K(:John,W0,w1) ⊃ H(w1,:At(:John,:Sf1))                               2,A1
8.  K(:John,W0,w1) ⊃ H(w1,:At(:John,:Ppr1))                              3,A1
9.  K(:John,W0,w1) ⊃ H(w1,:Reads(:John))                                 4,RDS1


Lines 1 - 5 translate the premises into the meta-language. Line 6 is a restatement of line
5. From the premises we conclude that John knows that he is at the same place as Sf1 and
Ppr1 (lines 7 - 8), and that he knows that he can read (line 9).


10. K(:John,W0,w1)                                                       Ass
11. @(V(w1,:Comb(:Sf1))) = V(w1,:Info(:Ppr1))                            6,10
12. H(w1,:At(:John,:Sf1))                                                7,10
13. H(w1,:At(:John,:Ppr1))                                               8,10
14. H(w1,:Reads(:John))                                                  9,10
15. R(:Do(:John,:Read(:Ppr1)),w1,W2)                                     13,14,RD1
16. @(V(W2,:Comb(:Sf1))) = V(W2,:Info(:Ppr1))                            11,15,D4
17. H(W2,:At(:John,:Sf1))                                                12,15,D4
18. K(:John,W2,w3) ⊃ H(w3,:At(:John,:Sf1))                               17,A1
19. K(:John,W2,w3) ≡ ((V(W2,:Info(:Ppr1)) = V(w3,:Info(:Ppr1))) ∧        15,RD2
        ∃w4(K(:John,w1,w4) ∧ R(:Do(:John,:Read(:Ppr1)),w4,w3)))


We let w1 be a typical world which is possible according to what John knows in W0
(line 10). In w1, then, the information written on Ppr1 is the standard description of the
combination of the safe (line 11). John is at the same place as the safe and the piece of
paper (lines 12 - 13), and John can read (line 14). Since John can read and he is at the
same place as the piece of paper, John can read the piece of paper, resulting in situation W2
(line 15). Since reading the piece of paper doesn't change either what is on the paper or the
combination of the safe, the information written on the piece of paper in W2 is the standard
description of the combination of the safe in W2 (line 16). Also, John is still at the same
place as the safe in W2, and he knows that this is true (lines 17 - 18). After reading the
piece of paper, John knows what is on the piece of paper, and that he has read it (line 19).


20. K(:John,W2,w3)                                                       Ass
21. H(w3,:At(:John,:Sf1))                                                18,20
22. V(W2,:Info(:Ppr1)) = V(w3,:Info(:Ppr1))                              19,20
23. ∃w4(K(:John,w1,w4) ∧ R(:Do(:John,:Read(:Ppr1)),w4,w3))               19,20
24. K(:John,w1,W4)                                                       23
25. K(:John,W0,W4)                                                       10,24,K3
26. @(V(W4,:Comb(:Sf1))) = V(W4,:Info(:Ppr1))                            6,25
27. R(:Do(:John,:Read(:Ppr1)),W4,w3)                                     23
28. @(V(w3,:Comb(:Sf1))) = V(w3,:Info(:Ppr1))                            26,27,D4
29. @(V(w3,:Comb(:Sf1))) = V(W2,:Info(:Ppr1))                            22,28
30. @(V(W2,:Comb(:Sf1))) = @(V(w3,:Comb(:Sf1)))                          16,29
31. V(W2,:Comb(:Sf1)) = V(w3,:Comb(:Sf1))                                30,L10
32. :Dial(V(W2,:Comb(:Sf1)),:Sf1) = :Dial(V(w3,:Comb(:Sf1)),:Sf1)        31
33. D(W2,Dial(Comb(Sf1),Sf1)) = D(w3,Dial(Comb(Sf1),Sf1))                32,L11b,L12a,L12b
34. D(w3,@(D(W2,Dial(Comb(Sf1),Sf1)))) = D(w3,Dial(Comb(Sf1),Sf1))       33,L10
35. T(w3,Eq(@(D(W2,Dial(Comb(Sf1),Sf1))),Dial(Comb(Sf1),Sf1)))           34,L13

We let w3 be a typical world which is possible according to what John knows in W2
(line 20). We conclude that in w3 John is at the same place as the safe (line 21) and the
piece of paper has the same information on it as it does in W2 (line 22), and that w3 is the
result of John reading the piece of paper in some world which is possible according to what
John knows in w1 (line 23). We call this world W4 (line 24). Since in W0 John knows
whether he knows something, W4 is also possible according to what John knows in W0 (line
25). From this we conclude that the information written on the piece of paper in W4 is the
standard description of the combination of the safe (line 26). Since w3 is the result of John
reading the piece of paper in W4 (line 27), the information written on the piece of paper in
w3 is the standard description of the combination of the safe (line 28). Since what is
written on the piece of paper in w3 is the same as in W2, and in both cases that is the
standard description of the combination of the safe in that world, the two
descriptions must be the same (lines 29 - 30), so the combination of the safe must be the
same in w3 as it is in W2 (line 31). This allows us to conclude that dialing the combination
of the safe in w3 is the same action as dialing the combination of the safe in W2 (lines 32 -
35).

36. R(:Do(:John,:Dial(V(w3,:Comb(:Sf1)),:Sf1)),w3,W5)                    1,21,D1
37. H(W5,:Open(:Sf1))                                                    36,D2
38. T(W5,Open(Sf1))                                                      37,L11b,L9a
39. R(:Do(D(W2,John),:Dial(V(w3,:Comb(:Sf1)),:Sf1)),w3,W5)               36,L11b
40. R(:Do(D(w3,@(D(W2,John))),:Dial(V(w3,:Comb(:Sf1)),:Sf1)),w3,W5)      39,L10
41. R(D(w3,Do(@(D(W2,John)),Dial(Comb(Sf1),Sf1))),w3,W5)                 40,L11b,L12a,L12b
42. T(w3,Res(Do(@(D(W2,John)),Dial(Comb(Sf1),Sf1)),Open(Sf1)))           38,41,R1
43. T(w3,And(Eq(@(D(W2,Dial(Comb(Sf1),Sf1))),Dial(Comb(Sf1),Sf1)),       35,42,L2
             Res(Do(@(D(W2,John)),Dial(Comb(Sf1),Sf1)),Open(Sf1))))


Since in w3 John is at the same place as the safe, it is possible for him to dial the
combination of the safe, and we call the resulting situation W5 (line 36). Since the
combination John dials is the combination of the safe, the safe is open in W5 (lines 37 - 38).
Hence, it is physically possible in w3 for John to open the safe by dialing the combination
(lines 39 - 42). Line 43 conjoins this fact with the previous conclusion that dialing the
combination of the safe is the same action in w3 as it is in W2.


44. K(:John,W2,w3) ⊃                                                     Dis(20,43)
        T(w3,And(Eq(@(D(W2,Dial(Comb(Sf1),Sf1))),Dial(Comb(Sf1),Sf1)),
                 Res(Do(@(D(W2,John)),Dial(Comb(Sf1),Sf1)),Open(Sf1))))
45. K(D(W2,@(D(W0,John))),W2,w3) ⊃                                       44,L11b,L10
        T(w3,And(Eq(@(D(W2,Dial(Comb(Sf1),Sf1))),Dial(Comb(Sf1),Sf1)),
                 Res(Do(@(D(W2,John)),Dial(Comb(Sf1),Sf1)),Open(Sf1))))
46. T(W2,Know(@(D(W0,John)),                                             45,K1
        And(Eq(@(D(W2,Dial(Comb(Sf1),Sf1))),Dial(Comb(Sf1),Sf1)),
            Res(Do(@(D(W2,John)),Dial(Comb(Sf1),Sf1)),Open(Sf1)))))
47. T(W2,Can(@(D(W0,John)),Dial(Comb(Sf1),Sf1),Open(Sf1)))               46,C1

Since  w3  is  an  arbitrarily  chosen  world  which  is  possible  according  to  what  John  knows 
in  W2,  we  conclude  that  in  W2  John  knows  what  action  dialing  the  combination  of  the  safe 
is,  and  he  knows  that  dialing  the  combination  of  the  safe  will  result  in  the  safe  being  open 
(lines  44  -  46).  So  in  W2  John  can  open  the  safe  by  dialing  the  combination  of  the  safe 
(line  47). 

48. D(W0,Read(Ppr1)) = D(w1,Read(Ppr1))                                  L11b,L12b
49. D(w1,@(D(W0,Read(Ppr1)))) = D(w1,Read(Ppr1))                         48,L10
50. T(w1,Eq(@(D(W0,Read(Ppr1))),Read(Ppr1)))                             49,L13
51. R(:Do(D(w1,@(D(W0,John))),:Read(:Ppr1)),w1,W2)                       15,L11b,L10
52. R(D(w1,Do(@(D(W0,John)),Read(Ppr1))),w1,W2)                          51,L11b,L12b
53. T(w1,Res(Do(@(D(W0,John)),Read(Ppr1)),                               52,47,R1
             Can(@(D(W0,John)),Dial(Comb(Sf1),Sf1),Open(Sf1))))
54. T(w1,And(Eq(@(D(W0,Read(Ppr1))),Read(Ppr1)),                         50,53,L2
             Res(Do(@(D(W0,John)),Read(Ppr1)),
                 Can(@(D(W0,John)),Dial(Comb(Sf1),Sf1),Open(Sf1)))))

By making Ppr1 a rigid designator, we imply that John knows what object has the
combination of the safe written on it, so reading the piece of paper in w1 is the same
action as reading the piece of paper in W0 (lines 48 - 50). We already know that in W2
John can open the safe by dialing the combination and that W2 is the result of reading the
piece of paper in w1, so in the result of reading the piece of paper in w1, John can open
the safe by dialing the combination (lines 51 - 53). Line 54 conjoins this fact with the
previous conclusion that reading the piece of paper in w1 is the same action as reading the
piece of paper in W0.


55.  K(:John,W0,w1) ⊃ T(w1,And(Eq(@(D(W0,Read(Ppr1))),Read(Ppr1)),  Dis(10,54)

Res(Do(@(D(W0,John)),Read(Ppr1)),

Can(@(D(W0,John)),Dial(Comb(Sf1),Sf1),Open(Sf1)))))

56.  T(W0,Know(John,And(Eq(@(D(W0,Read(Ppr1))),Read(Ppr1)),  55,L11b,K1

Res(Do(@(D(W0,John)),Read(Ppr1)),

Can(@(D(W0,John)),Dial(Comb(Sf1),Sf1),Open(Sf1))))))

57.  T(W0,Can(John,Read(Ppr1),  56,C1

Can(@(D(W0,John)),Dial(Comb(Sf1),Sf1),Open(Sf1))))

58.  T(W0,Can(John,(Read(Ppr1); Dial(Comb(Sf1),Sf1)),Open(Sf1)))  57,C2

59.  True(Can(John,(Read(Ppr1); Dial(Comb(Sf1),Sf1)),Open(Sf1)))  58,L1


Since w1 is an arbitrarily chosen world which is possible according to what John knows
in W0, we conclude that in W0 John knows what action reading the piece of paper is, and
he knows that reading the piece of paper will result in a situation where he can open the
safe by dialing the combination (lines 55 - 56). So in W0, by reading the piece of paper
John can bring about a situation where he can open the safe by dialing the combination
(line 57). Finally, John can open the safe by first reading the piece of paper and then
dialing the combination of the safe (lines 58 - 59).

This  section  concludes  the  discussion  of  purely  representational  and  logical  issues.  We 
have  presented  a  formalism  that  allows  us  to  represent  and  reason  with  information  about 
what  someone  knows,  information  about  the  effects  of  actions,  and  information  about  the 
interactions  between  the  two.  While  we  have  used  axioms  describing  the  properties  of 
particular  actions  and  predicates,  much  of  the  power  of  the  system  comes  from  its  ability  to 
make  use  of  general  principles  about  knowledge  and  action.  In  the  rest  of  the  thesis  we  will 
examine  the  problems  involved  in  designing  procedures  to  do  this  reasoning  automatically. 

6.  Automating Deductions about Knowledge

6.1  Procedural Deduction and First-order Logic

In  this  chapter,  we  will  discuss  the  problem  of  how  to  algorithmically  generate  a 
deduction  of  a  desired  conclusion  from  a  group  of  facts  involving  knowledge  and  action. 
We  will  begin  by  looking  at  some  general  considerations  in  the  area  of  automatic  deduction. 
This  subject  was  reviewed  in  detail  in  Moore  (1975),  and  on  many  points  the  reader  may 
refer  to  this  source  for  further  discussion. 

Before  going  further,  we  should  consider  the  possibility  that  we  have  made  an 
insurmountable  mistake  by  choosing  first-order  logic  as  the  basis  of  our  representation.  It  is 
often argued in the AI literature (Minsky, 1974) (Hewitt, 1975) (Smith, 1977) that there is
something  fundamentally  wrong  with  formal  logic  (and  its  semantics)  as  a  representation  of 
knowledge,  and  that  many  of  the  problems  of  creating  reasoning  programs  are  more  easily 
handled  by  using  Frames,  Actors,  or  some  other  representation  scheme  instead. 

It  is  certainly  true  that  traditional  logic  is  limited  in  many  ways.  Notions  of  plausible 
inference  and  retracting  conclusions  in  the  face  of  better  evidence  do  not  fit  comfortably 
into  the  framework  of  model-theoretic  semantics,  as  we  have  already  seen.  However,  I 
believe that these deficiencies must be remedied by extending logic, not replacing it. Any
system adequate for representing the knowledge of an intelligent being must surely be able
to: 


(1)  Say  that  something  has  a  certain  property  without  saying  which  thing  has  that 
property. 

(2)  Say  that  everything  in  a  certain  class  has  a  certain  property  without  saying  what 
everything  in  that  class  is. 

(3)  Say  that  at  least  one  of  two  statements  is  true  without  saying  which  statement  is 
true. 

(4)  Explicitly say that a statement is false.

(5)  Either settle or leave open to doubt whether two non-identical expressions name
the same object.

Any  representation  scheme  that  has  all  these  abilities  will  have  at  least  a  subset  which  is 
isomorphic  to  first-order  logic,  and  for  which  model-theoretic  semantics  will  be  an 
acceptable, if not total, explanation. Furthermore, it is precisely the difficult problems of
reasoning with quantifiers, disjunction, and equality that have not been dealt with
adequately by any of the proposed alternatives to logic.

It  is  important  to  note  that  in  reasoning  about  knowledge  and  action  we  have  to  face 
most of the problems of reasoning in first-order logic. Often AI systems avoid this by
embodying  simplifying  assumptions  about  the  logical  structure  of  the  system’s  knowledge. 
The  most  frequent  such  assumption  is  that  the  system  has  a  complete  description  of  the 
problem  domain  and  the  problem  situation.  The  blocks  world  reasoning  component  of 
Winograd's  (1971)  SHRDLU  is  the  paradigm  example  of  this  type  of  system.  In  SHRDLU, 
questions of the form ∃x1(P(x1)) or ∀x1(P(x1)) are answered by checking whether the system
knows of an object that satisfies P, or whether every object the system knows about satisfies
P; any two terms are assumed to represent different objects unless they can be evaluated to
the same expression; and any statement which cannot be shown to be true is assumed to
be  false.  In  chapter  1  of  Moore  (1975)  it  is  shown  how  these  assumptions  are  virtually  built 
into  PLANNER  and  related  AI  problem-solving  languages. 

We  cannot  make  any  of  these  assumptions,  however.  The  examples  we  looked  at  in  the 
preceding  chapters  make  use  of  the  full  logical  power  of  our  formalism.  In  particular,  we 
depend  crucially  on  being  able  to  quantify  over  an  infinite  set  of  possible  worlds,  we  reason 
explicitly  about  the  equality  of  terms,  and  we  seek  positive  evidence  for  inferring  statements 
to  be  false. 


These  observations  narrow  the  range  of  possibilities  open  to  us.  One  extreme  approach 
would  be  to  devise  an  ad  hoc  set  of  inference  procedures  for  exactly  the  inferences  we  think 
we  will  need  to  make.  This  approach,  although  it  sometimes  produces  impressive 
performance,  has  serious  problems.  There  is  no  reason  to  assume  that  the  techniques  used 
will  generalize  to  other  problem  domains,  or  even  to  other  problems  in  the  same  domain. 
Moreover,  as  the  system  is  expanded  to  handle  more  and  more  situations,  it  can  bog  down 
in  searching  for  rules  or  procedures  that  apply  to  the  particular  situation  at  hand. 

The  other  extreme  would  be  to  use  one  of  the  uniform  proof  procedures  for  first-order 
logic  and  the  axioms  described  in  chapters  4  and  5.  Experience  has  shown,  however,  that 
for  even  a  moderate  number  of  axioms  this  approach  can  be  very  inefficient,  especially 
when  many  of  the  axioms  are  not  needed  for  the  solution  of  the  problem.  A  number  of 
possible  reasons  for  this  are  discussed  in  Moore  (1975). 

We  will  try  to  steer  a  middle  course  between  these  two  extremes.  We  will  describe  an 
approach  which  uses  a  general  proof  procedure,  but  which  augments  that  procedure  with 
domain-specific  knowledge  of  how  certain  facts  are  to  be  used.  We  will  call  systems  that 
take  this  approach  procedural  deduction  systems. 

In  taking  this  type  of  approach,  we  have  to  decide  what  constitutes  "cheating".  In  an  ad 
hoc  approach,  nothing  is  considered  to  be  cheating,  but  this  leaves  open  the  possibility  that 
the  answers  to  specific  problems  are  directly  built  into  supposedly  general  problem-solving 
techniques.  In  a  strict  approach  based  on  a  uniform  proof  procedure,  including  anything 
other  than  first-order  axioms  is  considered  cheating. 

Neither  of  these  attitudes  seems  to  be  quite  right.  We  want  to  design  programs  which 
are  expert  at  reasoning  about  knowledge  and  action.  Certainly  part  of  that  expertise  is 
knowing  how  and  when  to  use  a  given  fact,  but  this  kind  of  knowledge  is  not  accessible  to  a 
uniform  proof  procedure.  At  the  same  time,  we  don’t  want  to  require  that  the  system  have 
special knowledge in order to solve specific problems; we seek generality at least across the
problem  domain.  Therefore,  problem-specific  assertions  and  goals  will  be  represented  as 
expressions  in  pure  first-order  logic,  and  specific  control  information  will  be  provided  only 
for  facts  which  have  domain-wide  applicability. 

6.2  Outline  of  a  Procedural  Deduction  System 

The  procedural  deduction  system  we  will  use  can  be  characterized  as  being  a  natural- 
deduction  system  which  uses  both  backward  and  forward  chaining,  handles  quantifiers  by 
means  of  Skolemization  and  unification,  and  has  some  limited  ability  to  handle  equality. 
We  will  begin  by  describing  how  the  system  works  given  a  complex  goal  and  data-base  of 
simple  assertions.  We  will  then  go  on  to  describe  how  complex  assertions  are  used  as  either 
backward-chaining  or  forward-chaining  rules  of  inference. 

The  most  basic  operation  in  any  deduction  system  is  matching.  An  assertion  satisfies  a 
goal  just  in  case  they  match.  For  our  matching  routine  we  will  use  the  unification 
procedure  from  resolution  theorem  proving.  (See  Chang  and  Lee  (1973)  for  a  review  of 
this  field.)  Two  expressions  match  if  and  only  if  there  is  a  substitution  for  the  variables  in 
each expression which makes the expressions identical. For instance, the goal P(x1,A)
matches the assertion P(B,x2) because they can be made identical by substituting B for x1 in
the goal and A for x2 in the assertion. The substitution must be
uniform within each expression. That is, every instance of a particular variable in one of
the expressions must receive the same value. If the two expressions being unified contain
the same variables, it may be necessary to rename the variables in one of them to avoid
confusion in specifying the substitution.

Frequently,  we  will  need  to  know  what  substitution  was  used  to  unify  the  expressions. 
When that is the case, and when there is more than one possible unifying substitution, we
will pick the most general (i.e. least restricting). This is called the most general unifier. For
example, P(x1,x2) and P(x3,A) can be unified by substituting A for every variable, but it
would be less restrictive to substitute x3 for x1 and A for x2. The most general unifier is
guaranteed to be unique up to the choice of variable names.
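
As an illustration, the matching step can be sketched in Python. This is a minimal sketch and not the matcher of any particular system: the term representation (variables as strings beginning with "?", compound expressions as tuples headed by their predicate or function symbol) and the names unify, walk, occurs, and resolve are assumptions made here. Renaming of shared variables and the type restrictions on unification are left out.

```python
def is_var(t):
    # Assumed convention: variables are strings that start with "?".
    return isinstance(t, str) and t.startswith("?")

def walk(t, s):
    # Follow variable bindings in substitution s until an unbound term is reached.
    while is_var(t) and t in s:
        t = s[t]
    return t

def occurs(v, t, s):
    # True if variable v occurs inside term t under substitution s.
    t = walk(t, s)
    if t == v:
        return True
    if isinstance(t, tuple):
        return any(occurs(v, a, s) for a in t[1:])
    return False

def unify(a, b, s=None):
    # Return a most general unifier of a and b as a dict, or None if none exists.
    s = {} if s is None else s
    a, b = walk(a, s), walk(b, s)
    if a == b:
        return s
    if is_var(a):
        return None if occurs(a, b, s) else {**s, a: b}
    if is_var(b):
        return None if occurs(b, a, s) else {**s, b: a}
    if (isinstance(a, tuple) and isinstance(b, tuple)
            and a[0] == b[0] and len(a) == len(b)):
        for x, y in zip(a[1:], b[1:]):
            s = unify(x, y, s)
            if s is None:
                return None
        return s
    return None

def resolve(t, s):
    # Apply substitution s throughout term t.
    t = walk(t, s)
    if isinstance(t, tuple):
        return (t[0],) + tuple(resolve(a, s) for a in t[1:])
    return t

# The goal P(x1,A) against the assertion P(B,x2).
print(unify(("P", "?x1", "A"), ("P", "B", "?x2")))

# P(x1,x2) against P(x3,A): the unifier found binds x1 to x3 and x2 to A.
mgu = unify(("P", "?x1", "?x2"), ("P", "?x3", "A"))
print(resolve(("P", "?x1", "?x2"), mgu))
```

The unifier binds a variable only when it is forced to, which is why the second example leaves x3 unbound instead of substituting A for every variable.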

Quantifiers are handled by Skolemization. Whenever a formula of the form ∀x(P(x)) is
asserted, it will be replaced by the formula P(x). Whenever a formula of the form
∃x(P(x,y1,...,yn)) is asserted, where y1,...,yn are the only free variables in the formula, we
replace the formula by P(F(y1,...,yn),y1,...,yn) where F is a newly created function symbol. If
there are no free variables in the formula, ∃x(P(x)) becomes P(F) where F is a newly created
constant  symbol.  These  two  cases  can  be  combined  if  we  think  of  constants  as  being 
functions  of  no  arguments. 

The  function  symbol  F  is  called  a  Skolem  function.  The  intuitive  idea  behind  the 
introduction  of  Skolem  functions  is  that  if  we  know  that  some  object  satisfies  the  formula 
P(x),  we  can  give  that  object  a  name  such  as  F.  If  that  name  is  not  used  anywhere  else  in 
the  system,  then  we  are  in  no  danger  of  proving  anything  from  P(F)  that  we  couldn't  have 
proved from ∃x(P(x)). If there are free variables in the formula, F must be a function of
those  variables  to  allow  for  the  possibility  that  for  each  assignment  of  values  to  those 
variables,  there  is  a  different  object  which  makes  P(x)  true. 

For quantifiers in goals, the process is reversed. Any goal of the form ∃x(P(x)) will be
replaced by the goal P(x). A goal of the form ∀x(P(x,y1,...,yn)), where y1,...,yn are the only
free variables in the goal, will be replaced by the goal P(F(y1,...,yn),y1,...,yn), where F is a
newly created function symbol. The intuitive basis for replacing universal quantifiers by
Skolem functions in goals is that if we can prove that the arbitrarily selected object named
by the Skolem function satisfies P(x), then everything must satisfy P(x).
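
The two Skolemization conventions, one for assertions and one for goals, can be sketched as a single Python function. This is a minimal sketch under stated assumptions: formulas are nested tuples such as ("forall", "?x", body), only quantifiers wrapped around an atomic formula are handled, bound variables are assumed to have distinct names, and the function names and the fresh iterator of new symbol names are hypothetical.

```python
def substitute(formula, var, value):
    # Replace every occurrence of variable var in a nested tuple by value.
    if formula == var:
        return value
    if isinstance(formula, tuple):
        return (formula[0],) + tuple(substitute(a, var, value) for a in formula[1:])
    return formula

def skolemize(formula, mode, fresh, enclosing=()):
    # mode is "assertion" or "goal"; fresh yields unused Skolem symbol names.
    # enclosing collects the variables of quantifiers that were simply dropped.
    if formula[0] in ("forall", "exists"):
        quantifier, var, body = formula
        # Assertions drop universal quantifiers; goals drop existential ones.
        dropped = "forall" if mode == "assertion" else "exists"
        if quantifier == dropped:
            return skolemize(body, mode, fresh, enclosing + (var,))
        # The other quantifier is replaced by a Skolem constant or function.
        name = next(fresh)
        skolem_term = (name,) + enclosing if enclosing else name
        return skolemize(substitute(body, var, skolem_term), mode, fresh, enclosing)
    return formula

forall_exists = ("forall", "?x", ("exists", "?y", ("P", "?x", "?y")))
exists_forall = ("exists", "?u", ("forall", "?v", ("P", "?v", "?u")))

# As an assertion, the existential variable becomes a function of x.
print(skolemize(forall_exists, "assertion", iter(["F"])))
# As a goal, the universal variable becomes a function of u.
print(skolemize(exists_forall, "goal", iter(["G"])))
```

Passing the supply of new names as an argument keeps the sketch deterministic; a real system would generate symbols that are guaranteed not to appear anywhere else in the data base.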

To see how Skolemization interacts with unification, consider how we can prove
∀x(∃y(P(x,y))) from ∃u(∀v(P(v,u))). By Skolemization, the goal gets converted into P(F,y), and
the assertion gets converted into P(v,G), where F and G are newly created Skolem constants.
The goal and the assertion can be made to match by substituting F for v and G for y.
Notice that we cannot make the converse inference, which would be invalid. That is, we
cannot infer ∃u(∀v(P(v,u))) directly from ∀x(∃y(P(x,y))). In this case, the goal is converted
into P(G(u),u), and the assertion is converted into P(x,F(x)). If we try to unify these two
formulas, we have to substitute G(u) for x, which makes F(x) into F(G(u)). But this
expression then has to be unified with u, and there is no substitution for u which
will make it identical with F(G(u)), so the match fails.

We  will  elaborate  this  system  slightly  to  handle  the  typed  variables,  functions,  and 
constants introduced into our logical formalism in chapters 4 and 5. We will restrict
unification so that two expressions match only if they are of the same type. Furthermore,
Skolem  functions  and  constants  will  always  be  of  the  same  type  as  the  variables  they 
replace. 

This  method  of  handling  quantifiers  is  essentially  the  same  as  is  used  in  resolution 
theorem  provers,  except  that  we  treat  goals  directly,  rather  than  doing  proof  by 
contradiction.  Unification  of  Skolemized  formulas  is  known  to  be  a  logically  complete 
treatment  of  quantifiers  (see  Chang  and  Lee  (1973)).  In  our  treatment  of  propositional 
connectives,  however,  we  will  give  up  completeness  in  order  to  simplify  our  approach. 

In  proving  complex  goals  built  up  using  propositional  connectives,  we  will  use  an 
elaboration  of  the  standard  And/Or-tree  approach.  A  goal  of  the  form  (P  v  Q)  will  be 
replaced  by  two  independent  goals  P  and  Q.  If  either  of  these  goals  is  satisfied,  then  the 
original goal is satisfied. Conjunctive goals are more complicated. If we have (P(x) ∧ Q(x))
as a goal, we not only have to prove P(x) and prove Q(x), but we have to make sure that x
receives the same value in each proof. That is, if we prove P(x) by matching the assertion
P(A), we have to then prove Q(A) in order to satisfy the original goal. So our procedure for
handling conjunctive goals is as follows: To prove (P1 ∧...∧ Pn), first prove P1, then prove
each Pi, i > 1, in order using the bindings for free variables obtained from the proof of Pi-1.
The original goal is satisfied if and only if all of the subgoals are satisfied. This method of
solving conjunctive goals is called splitting.
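
Splitting can be illustrated with a short Python sketch. It makes simplifying assumptions that are not part of the system described here: assertions are ground atomic facts with constant arguments, so a one-way matcher suffices in place of full unification, and the names match and prove_all are hypothetical.

```python
def is_var(t):
    # Assumed convention: variables are strings that start with "?".
    return isinstance(t, str) and t.startswith("?")

def match(goal, fact, bindings):
    # Match a goal atom against a ground fact; return extended bindings or None.
    if len(goal) != len(fact) or goal[0] != fact[0]:
        return None
    b = dict(bindings)
    for g, f in zip(goal[1:], fact[1:]):
        g = b.get(g, g)          # use the value already bound to g, if any
        if is_var(g):
            b[g] = f
        elif g != f:
            return None
    return b

def prove_all(goals, facts, bindings=None):
    # Prove the goals in order, carrying bindings from each proof to the next.
    bindings = {} if bindings is None else bindings
    if not goals:
        yield bindings
        return
    first, rest = goals[0], goals[1:]
    for fact in facts:
        b = match(first, fact, bindings)
        if b is not None:
            yield from prove_all(rest, facts, b)

facts = [("P", "A"), ("P", "B"), ("Q", "B")]
# Goal (P(x) ∧ Q(x)): P(A) is tried first, Q(A) fails, and P(B) then succeeds.
print(list(prove_all([("P", "?x"), ("Q", "?x")], facts)))
```

Because prove_all is a generator, abandoning one choice of binding for x and trying the next one falls out of ordinary iteration.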

We will also allow implications and biconditionals to occur as goals, and these will be
proved using natural deduction. We will have several different ways of writing
implications and biconditionals, corresponding to their different procedural interpretations
as assertions, but all variants will be treated the same when they occur as goals. A
biconditional goal, e.g. (P <-> Q), will be replaced by an equivalent conjunction of
implications, ((P -> Q) ∧ (Q -> P)). This new goal will then be attacked using splitting. An
implication, such as (P -> Q), will be proved by asserting the antecedent P and proving the
consequent Q using any assertions derived from P and previously known facts. The proof
of (P -> Q) succeeds if the proof of Q succeeds or if asserting P generates a contradiction. If
(P -> Q) is being proved as a subgoal of a branch of a split, then the assertion of P is local
to that branch of the split. The data base of assertions must be returned to its former state
before the next branch of the split is attacked. If we are trying to prove ((P -> Q) ∧ R), we
first assert P and try to prove Q. Then we must remove the effects of asserting P before
trying to prove R, or else we will be proving only the weaker condition (P -> (Q ∧ R)).
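
The need to keep the assertion of an antecedent local to its branch can be seen in a small Python sketch. It is deliberately restricted: the data base is a set of atomic propositions, antecedents are assumed to be atomic, no inference rules are applied, and the case where asserting P produces a contradiction is not modelled. The function name prove is hypothetical.

```python
def prove(goal, db):
    # db: a set of atomic propositions (strings).
    # Goals: atoms, ("and", g1, g2), or ("imp", p, q) with an atomic antecedent p.
    if isinstance(goal, tuple) and goal[0] == "and":
        # Splitting: each branch is attacked with the same data base.
        return prove(goal[1], db) and prove(goal[2], db)
    if isinstance(goal, tuple) and goal[0] == "imp":
        # Assert the antecedent in a copy, so the original data base is
        # automatically back in its former state when this branch returns.
        local = db | {goal[1]}
        return prove(goal[2], local)
    return goal in db

# ((P -> P) ∧ P) is not provable from an empty data base ...
print(prove(("and", ("imp", "P", "P"), "P"), set()))
# ... but the weaker (P -> (P ∧ P)) is.
print(prove(("imp", "P", ("and", "P", "P")), set()))
```

Building a new set for each implication is the simplest way to restore the data base; a system with a large data base would more likely record and undo the individual assertions.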

Finally, we will treat negations in both goals and assertions by pushing them down to
the atomic level. We will replace ¬(P ∨ Q) by (¬P ∧ ¬Q), ¬(P ∧ Q) by (¬P ∨ ¬Q), ¬(P -> Q) by (P
∧ ¬Q), ¬(P <-> Q) by ((P ∧ ¬Q) ∨ (Q ∧ ¬P)), and ¬(¬P) by P. At the atomic level, negation will
be handled by the matcher. So just as P(x) matches P(A), ¬P(x) matches ¬P(A).
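
The rewrite rules for negation translate directly into a recursive function. The following Python sketch assumes a tuple representation with the operator names "not", "and", "or", "imp", and "iff"; these names and the function name push_not are chosen here for illustration.

```python
CONNECTIVES = ("and", "or", "imp", "iff")

def push_not(f):
    # Push negations down to the atomic level using the five rewrite rules.
    if isinstance(f, tuple) and f[0] == "not":
        g = f[1]
        if isinstance(g, tuple):
            op = g[0]
            if op == "not":                       # ¬(¬P) becomes P
                return push_not(g[1])
            if op == "or":                        # ¬(P ∨ Q) becomes (¬P ∧ ¬Q)
                return ("and", push_not(("not", g[1])), push_not(("not", g[2])))
            if op == "and":                       # ¬(P ∧ Q) becomes (¬P ∨ ¬Q)
                return ("or", push_not(("not", g[1])), push_not(("not", g[2])))
            if op == "imp":                       # ¬(P -> Q) becomes (P ∧ ¬Q)
                return ("and", push_not(g[1]), push_not(("not", g[2])))
            if op == "iff":                       # ¬(P <-> Q)
                p, q = g[1], g[2]
                return ("or",
                        ("and", push_not(p), push_not(("not", q))),
                        ("and", push_not(q), push_not(("not", p))))
        return f                                  # a negated atom is left alone
    if isinstance(f, tuple) and f[0] in CONNECTIVES:
        return (f[0], push_not(f[1]), push_not(f[2]))
    return f

print(push_not(("not", ("or", "P", "Q"))))
print(push_not(("not", ("and", "P", ("not", "Q")))))
```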

We will augment these methods with some simplification and deletion procedures. The
simplification rules are based on recognizing certain contradictory or tautologous
subexpressions in goals and assertions. The easiest way to state these rules is to introduce
the special proposition symbols T for true and F for false. The simplification rules are as
follows: Replace any conjunction containing both a formula and its negation by F. Replace
any disjunction containing both a formula and its negation by T. Then replace (T ∨ P) by
T, (T ∧ P) by P, (T -> P) by P, (P -> T) by T, (T <-> P) or (P <-> T) by P, and ¬T by F.
Replace (F ∧ P) by F, (F ∨ P) by P, (F -> P) by T, (P -> F) by ¬P, (F <-> P) or (P <-> F) by
¬P, and ¬F by T.

If  an  entire  assertion  simplifies  to  T,  it  is  a  tautology  which  can  be  discarded.  If  an 
assertion  simplifies  to  F  and  it  occurs  as  the  result  of  asserting  the  antecedent  of  an 
implication we are trying to prove, then the implication is proved; otherwise it indicates that
the  premises  of  the  problem  are  inconsistent.  If  a  goal  simplifies  to  T  then  the  goal  has 
been  solved.  If  a  goal  simplifies  to  F  then  it  is  self-contradictory  and  should  be  abandoned. 
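
A minimal Python sketch of the simplification rules follows. It assumes binary connectives only, propositional atoms written as strings, and the reserved strings "T" and "F" for the two special symbols; the names neg and simplify are hypothetical.

```python
T, F = "T", "F"
OPS = ("not", "and", "or", "imp", "iff")

def neg(f):
    # The negation of f, cancelling an outer "not" if there is one.
    return f[1] if isinstance(f, tuple) and f[0] == "not" else ("not", f)

def simplify(f):
    # Simplify subformulas first, then apply the rules for T and F at this level.
    if not isinstance(f, tuple) or f[0] not in OPS:
        return f
    op, args = f[0], [simplify(a) for a in f[1:]]
    if op == "not":
        (a,) = args
        if a == T:
            return F
        if a == F:
            return T
        return ("not", a)
    a, b = args
    if op == "and":
        if F in (a, b) or a == neg(b):   # contains F, or a formula and its negation
            return F
        if a == T:
            return b
        if b == T:
            return a
    elif op == "or":
        if T in (a, b) or a == neg(b):   # contains T, or a formula and its negation
            return T
        if a == F:
            return b
        if b == F:
            return a
    elif op == "imp":
        if a == T:
            return b
        if b == T or a == F:
            return T
        if b == F:
            return simplify(("not", a))
    elif op == "iff":
        if a == T:
            return b
        if b == T:
            return a
        if a == F:
            return simplify(("not", b))
        if b == F:
            return simplify(("not", a))
    return (op, a, b)

# ((P ∧ ¬P) -> Q) simplifies to T, by way of (F -> Q).
print(simplify(("imp", ("and", "P", ("not", "P")), "Q")))
```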

The  other  deletion  rules  that  we  will  use  are  to  delete  repeated  instances  of  assertions 
and  goals,  and  goals  that  are  contradicted  by  a  single  assertion.  We  could  use  more 
elaborate  techniques  of  this  type  based  on  the  subsumption  procedure  used  in  resolution 
systems,  but  these  simple  methods  are  sufficient  for  our  examples. 

6.3  Procedural  Interpretation  of  Complex  Assertions 

Complex  assertions  will  be  broken  down  less  than  complex  goals.  We  have  already 
explained  the  handling  of  quantifiers  and  negation  in  complex  assertions.  Conjunction  will 
be handled quite simply by replacing any assertion of the form (P ∧ Q) by the two
independent  assertions  P  and  Q.  This  leaves  us  with  disjunctions,  implications,  and 
biconditionals  still  to  treat.  We  will  use  these  logical  expressions  as  domain-specific 
inference  rules.  By  specifying  control  information  in  these  rules,  we  justify  calling  our 
deductive  system  procedural. 


There  are  two  types  of  control  information  that  we  will  use.  The  first  of  these  derives 
from  the  original  work  on  PLANNER,  where  Hewitt  (1972)  pointed  out  that  an  assertion  of 
the form (P ⊃ Q) has two very natural interpretations as an inference procedure: either
assert  Q  whenever  P  is  asserted,  or  in  order  to  prove  Q,  try  to  prove  P.  The  first  of  these 
two  methods  is  usually  called  forward  chaining,  and  the  second,  backward  chaining;  so  we 
will  refer  to  implication  assertions  used  in  these  ways  as  forward-chaining  or  backward- 
chaining  rules,  respectively. 

The same observation holds for the contrapositive form, (¬Q ⊃ ¬P), so there are two
more interpretations in the list of possibilities: either assert ¬P whenever ¬Q is asserted, or in
order to prove ¬P, try to prove ¬Q. In many situations choosing a set of procedural
interpretations for axioms of the form (P ⊃ Q) is the most important way of controlling the
size of the space that must be searched in making a deduction.

Most  deductive  systems  do  all  their  reasoning  by  backward  chaining  from  the  goal. 
This  is  done  because  unrestricted  forward  inference  will  frequently  produce  large  numbers 
of  formulas  that  have  nothing  to  do  with  the  current  goal.  This  problem  would  be 
especially  severe  in  a  large  data-base  containing  many  types  of  knowledge.  Unrestricted 
backward  inference  at  least  produces  subgoals  which  are  relevant  to  the  main  goal.  There 
are  many  cases,  however,  when  very  large  backward-chaining  searches  can  be  eliminated  by 
allowing  limited  forward  deduction. 

One  type  of  situation  where  this  is  true  is  reasoning  about  membership  in  a 
hierarchically structured set of classes. For instance, we might choose to represent the fact
that cats are mammals by the formula ∀x1(Cat(x1) ⊃ Mammal(x1)). We would have similar
formulas to represent the facts that all dogs are mammals and that all mammals are animals.
In the set of formulas defining this hierarchy, a predicate can occur many times on the right
side of an implication, but only once on the left.


Suppose we know that Felix is a cat, and we want to deduce something about Felix that
requires showing that Felix is an animal. If we interpret the axioms that define the
hierarchy as backward-chaining rules, then to show that Felix is an animal, we may have to
search through most of the kinds of animals we know about before hitting upon the
assertion that Felix is a cat. If, on the other hand, we interpret the axioms as forward-
chaining rules, then for each individual we know about we would make a few assertions
about what classes in the hierarchy the individual belongs to, but there would be no
searching at all on goals. So at the cost of a few assertions per individual, we can entirely
avoid searching a potentially very large space.
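
The forward-chaining treatment of the class hierarchy can be sketched in a few lines of Python. Because each predicate occurs only once on the left of an implication, the hierarchy is represented here as a dictionary from a class to its immediate superclass; that representation, the function name forward_chain, and the second individual Rex are assumptions made for illustration.

```python
def forward_chain(facts, superclass):
    # facts: iterable of (class, individual) pairs.
    # superclass: dict mapping each class to its immediate superclass.
    known = set(facts)
    agenda = list(facts)
    while agenda:
        cls, individual = agenda.pop()
        parent = superclass.get(cls)
        if parent is not None and (parent, individual) not in known:
            known.add((parent, individual))      # assert the consequent
            agenda.append((parent, individual))  # and chain forward from it
    return known

hierarchy = {"Cat": "Mammal", "Dog": "Mammal", "Mammal": "Animal"}
known = forward_chain([("Cat", "Felix")], hierarchy)

# The goal "Felix is an animal" is now answered by a single lookup.
print(("Animal", "Felix") in known)
```

Asserting that Felix is a cat adds two further assertions, one for each class above Cat, and no search is needed when the goal is later posed.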

Another  case  where  forward  deduction  is  desirable  is  where  inferences  can  form  a  chain 
which is finite in the forward direction, but infinite in the backward direction. For instance,
one of the axioms of number theory is that if the successor of x1 is less than x2, then x1 is
less than x2: ((S(x1) < x2) ⊃ (x1 < x2)). If we interpret this axiom as a backward-chaining
rule, it will generate infinitely many subgoals whenever it is invoked. The goal (A < B) will
generate the subgoal (S(A) < B), which will in turn generate the subgoal (S(S(A)) < B), etc.
If we interpret the axiom as a forward-chaining rule, however, the number of assertions
generated will be limited by the depth of nesting of S's in the original assertion. The
assertion (S(S(A)) < B) will generate the assertion (S(A) < B), which will generate the
assertion (A < B) and then stop.
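
The asymmetry between the two directions can be made concrete with a small Python sketch. The successor term S(t) is represented as the tuple ("S", t), and the two function names are hypothetical. The backward direction has to be cut off by an explicit limit, since it would otherwise never terminate.

```python
def forward_assertions(left, right):
    # Apply ((S(x1) < x2) ⊃ (x1 < x2)) forward from the assertion (left < right).
    derived = []
    while isinstance(left, tuple) and left[0] == "S":
        left = left[1]                 # strip one S from the left-hand term
        derived.append((left, right))
    return derived

def backward_subgoals(left, right, limit):
    # The first `limit` subgoals generated backward from the goal (left < right).
    subgoals = []
    for _ in range(limit):
        left = ("S", left)             # each subgoal adds one more S
        subgoals.append((left, right))
    return subgoals

# From (S(S(A)) < B): exactly two new assertions, then the chain stops.
print(forward_assertions(("S", ("S", "A")), "B"))
# From the goal (A < B): subgoals keep growing for as long as we let them.
print(backward_subgoals("A", "B", 3))
```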

Other  cases  where  forward  deduction  is  useful  include  expanding  defined  terms  by  their 
definitions,  putting  a  problem  description  into  canonical  form,  or  making  a  change  of 
representation  for  a  problem.  This  last  case  would  include  our  translating  from  the  modal 
representation  of  facts  about  knowledge  and  action  to  the  possible-world  representation. 

It is frequently the case that if there is a strong argument for interpreting an implication
(P ⊃ Q) as a forward-chaining or backward-chaining rule, there is an equally strong dual
argument for interpreting the contrapositive form (¬Q ⊃ ¬P) in the opposite way. For
instance, if we know that some individual is not an animal, we would not want to have to
assert all the different kinds of animals that it is not. Moreover, to prove by backward
chaining that this individual is not a cat would require checking only the few classes in the
hierarchy above cat. So the formula ∀x1(¬Mammal(x1) ⊃ ¬Cat(x1)) should definitely be
interpreted as a backward-chaining rule. Similarly, a moment's thought will show that (¬(x1
< x2) ⊃ ¬(S(x1) < x2)) should also be interpreted as a backward-chaining rule.

We  will  use  different  notations  to  specify  different  combinations  of  possible  procedural 
interpretations  of  complex  assertions: 

1.  (P  ->  Q):  If  P  is  ever  asserted,  also  assert  Q. 

2.  (Q <- P): In order to prove Q, try to prove P.

3.  (P => Q): If P is ever asserted, also assert Q, and in order to prove ¬P, try to prove
¬Q.

4.  (Q <= P): In order to prove Q, try to prove P, and if ¬Q is ever asserted, also assert
¬P.

5.  (P ∨ Q): In order to prove P, try to prove ¬Q, and in order to prove Q, try to
prove ¬P.

1 - 4 are all different procedural interpretations for (P ⊃ Q). 1 and 2 are simple
interpretations as a single forward-chaining or backward-chaining rule. 3 and 4 reflect our
observation that frequently if an implication is most efficiently used as a forward-chaining
rule, its contrapositive form is most efficiently used as a backward-chaining rule, and vice
versa. 5 can also be thought of as a procedural interpretation of implication because (P ∨ Q)
is equivalent to (¬P ⊃ Q) and (¬Q ⊃ P). 5 would be useful when there is no particular reason
to use forward chaining, so backward chaining is used in both cases. 5 can be generalized
to handle more than two disjuncts as follows:

5'.  (P1 ∨...∨ Pn): In order to prove Pi, try to prove (¬P1 ∧...∧ ¬Pi-1 ∧ ¬Pi+1 ∧...∧ ¬Pn).

That  is,  if  the  goal  we  wish  to  prove  is  one  of  a  number  of  possibilities,  we  should  try  to 
prove  that  all  the  other  possibilities  are  false. 
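
The four interpretations of an implication can be sketched as a compilation step that turns each annotated assertion into forward-chaining and backward-chaining rules. The Python below is a propositional sketch only: negation is written with a leading "~", the integer kinds 1 to 4 stand for the four notations, rules are assumed to be added before any facts are asserted, and the names neg, compile_rule, and Engine are hypothetical.

```python
def neg(p):
    # Negate a propositional literal written as a string.
    return p[1:] if p.startswith("~") else "~" + p

def compile_rule(kind, p, q):
    # Return (forward, backward) rule lists for the implication P ⊃ Q.
    # A forward rule is (trigger, conclusion); a backward rule is (goal, subgoal).
    if kind == 1:
        return [(p, q)], []
    if kind == 2:
        return [], [(q, p)]
    if kind == 3:
        return [(p, q)], [(neg(p), neg(q))]
    if kind == 4:
        return [(neg(q), neg(p))], [(q, p)]
    raise ValueError(kind)

class Engine:
    def __init__(self):
        self.facts, self.forward, self.backward = set(), [], []

    def add_rule(self, kind, p, q):
        fwd, bwd = compile_rule(kind, p, q)
        self.forward += fwd
        self.backward += bwd

    def assert_fact(self, fact):
        if fact in self.facts:
            return
        self.facts.add(fact)
        for trigger, conclusion in self.forward:
            if trigger == fact:
                self.assert_fact(conclusion)

    def prove(self, goal, depth=10):
        # depth bounds the length of a backward chain.
        if goal in self.facts:
            return True
        if depth == 0:
            return False
        return any(self.prove(sub, depth - 1)
                   for g, sub in self.backward if g == goal)

engine = Engine()
engine.add_rule(3, "Cat", "Mammal")
engine.add_rule(3, "Mammal", "Animal")
engine.assert_fact("~Animal")     # triggers no forward rule
print(engine.prove("~Cat"))       # proved backward through ~Mammal and ~Animal
```

Under interpretation 3 the positive direction is handled by forward chaining and the contrapositive by backward chaining, so asserting "Cat" would immediately add "Mammal" and "Animal", while the negative fact above produces no new assertions.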

We  also  have  two  procedural  interpretations  of  biconditionals: 

6.  (P <-> Q): In order to prove P, ¬P, Q, or ¬Q, try to prove Q, ¬Q, P, or ¬P,
respectively, but do not immediately reapply this rule.

7.  (P <=> Q): In all goals and assertions replace any active occurrence of P by Q.

6 interprets (P <-> Q) as a set of backward-chaining rules for transforming a goal
containing P to a goal containing Q, or vice versa. Since the rules go both directions, it is
useful to restrict (P <-> Q) from being applied twice in a row to prevent regenerating the
original goal. 7 is used when we always want to reason in terms of Q rather than P. It not
only generates a new formula containing Q, but also eliminates the formula containing P.
By an active occurrence of P, we mean an occurrence that is currently a candidate for being
matched. This would include the current goal, all atomic assertions, the left hand side of 1 -
4 and 7, and all of 5 and 6. We restrict our attention to active occurrences, so that if we
have an assertion like (P => Q) and Q is a very complicated expression, we don't have to go
rummaging around in Q looking for possible substitutions until we actually try to use Q.

All of the interpretations of 1 - 7 have been stated in purely propositional terms, but
they should be taken to cover cases with variables as well. For example, if we had the goal
Q(A) and the assertion (Q(x) <= P(x)), we would generate the goal P(A). The result of
applying  a  rule  must  of  course  take  into  account  the  substitution  that  was  used  to  make  the 
match  succeed. 

Notice that because our matcher works only on atomic expressions, the formulas in
active positions in assertions in the forms given in 1 - 7 must be atomic expressions in
order to be used. For instance, we have not specified any way to use an assertion like ((P ∧
Q) => R). This is not as much of a restriction as it seems, however, because formulas can
always be re-written or expanded, so that they fit into the patterns we handle. ((P ∧ Q) => R)
can be re-written as (P => (Q => R)). We could work out a set of rules for doing this
automatically,  or  we  could  extend  our  matching  rules  to  handle  more  complex  assertions,  but 
since  all  of  our  examples  can  be  handled  by  the  current  rules,  we  won't  bother  to  do  so. 

The  other  type  of  control  information  we  will  want  to  put  into  assertions  is  syntactic 
restrictions  on  the  use  of  those  assertions.  For  example,  one  very  concise  way  to  say  that 
John knows whether P is true is (K(:John,W0,w1) ⊃ (T(w1,P) ≡ T(W0,P))). That is, any world
which is compatible with what John knows in the actual world must agree with the actual
world as to whether P is true. A straightforward way of using a fact of this type would be
as a forward-chaining rule: whenever we have an assertion that a world, say W1, is
compatible with what John knows in the actual world W0, i.e. K(:John,W0,W1), we would
assert that W1 agrees with W0 as to whether P is true.

Recall, however, that since anything that is known by someone must be true, we have
axiom K2, ∀a1,w1(K(a1,w1,w1)), which says that every world is compatible with what
anyone knows in that world. Combining this assertion with the rule representing the fact
that John knows whether P is true would produce the tautologous conclusion (T(W0,P) ≡
T(W0,P)).

We could add to our rules for recognizing tautologies a check for this pattern, but a
simpler solution to this problem would be to put a syntactic test into the assertion to prevent
application of the rule if the expression being bound to w1 is W0. The representation of
the assertion might then look like:

(K(:John,W0,w1)/[W0 ≠ w1] => (T(w1,P) <-> T(W0,P))).


The square brackets indicate that [W0 ≠ w1] is a syntactic test and not a goal to be
proved. The test indicated by ≠ is satisfied if after the pattern match the arguments of ≠
are not unifiable. Another way to achieve the same effect would be to add a piece of
advice (as in PLANNER) not to apply this rule to axiom K2. Neither of these restrictions
can be expressed as a pure logical formula. The closest we could come would be to write
something like:

((K(:John,W0,w1) ∧ (W0 ≠ w1)) ⊃ (T(w1,P) ≡ T(W0,P))).

This is too strong, however, since it would require us to prove that W0 and w1 are not
the same possible world before applying the rule. To avoid the problems we discussed
above, we only need to do a simple test to see whether they are the same expression. In
section 7.1 we will see a more complex example involving the axiom D3, where a
syntactic restriction is used to prevent a forward-chaining rule from generating infinitely
many assertions.
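The behavior of a syntactic test can be sketched as follows. This is a minimal Python illustration under assumed conventions (tuple-encoded terms, variables written with a leading "?", hypothetical function names); it is not the representation used by the system in the text. The test passes only when the two sides fail to unify after the trigger pattern has been matched.

```python
def is_var(t):
    return isinstance(t, str) and t.startswith("?")

def walk(t, s):
    while is_var(t) and t in s:
        t = s[t]
    return t

def unify(a, b, s=None):
    """Return an extended substitution, or None if a and b do not unify."""
    s = dict(s or {})
    stack = [(a, b)]
    while stack:
        x, y = stack.pop()
        x, y = walk(x, s), walk(y, s)
        if x == y:
            continue
        if is_var(x):
            s[x] = y
        elif is_var(y):
            s[y] = x
        elif isinstance(x, tuple) and isinstance(y, tuple) and len(x) == len(y):
            stack.extend(zip(x, y))
        else:
            return None
    return s

def substitute(t, s):
    t = walk(t, s)
    return tuple(substitute(u, s) for u in t) if isinstance(t, tuple) else t

def apply_restricted_rule(trigger, restrictions, conclusion, assertion):
    """Forward-chain from `assertion` through a rule of the form
    trigger/[restrictions] => conclusion. Returns the new assertion or None."""
    s = unify(trigger, assertion)
    if s is None:
        return None
    for left, right in restrictions:
        # Syntactic test: blocked if the two sides ARE unifiable after the match.
        if unify(left, right, s) is not None:
            return None
    return substitute(conclusion, s)

# (K(:John,W0,w1)/[W0 ≠ w1] => (T(w1,P) <-> T(W0,P)))
trigger = ("K", ":John", "W0", "?w1")
restrictions = [("W0", "?w1")]
conclusion = ("<->", ("T", "?w1", "P"), ("T", "W0", "P"))

blocked = apply_restricted_rule(trigger, restrictions, conclusion, ("K", ":John", "W0", "W0"))
fired = apply_restricted_rule(trigger, restrictions, conclusion, ("K", ":John", "W0", "W1"))
```

Note that the test never tries to prove that W0 and W1 denote different worlds; it only checks that the two expressions cannot be made identical.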

The use of syntactic restrictions is also helpful in solving a problem relating to our
treatment of (P <=> Q). We have interpreted this as a rule to replace all occurrences of P by
Q. But what happens if the occurrence of P being replaced is more general than the
instance in the replacement rule? For example, suppose we have the assertions (P(x) ∨ R(x))
and (P(A) <=> Q(A)). We can generate the new assertion (Q(A) ∨ R(A)) by substituting A for x
in the first assertion, and then substituting Q(A) for P(A), but if we delete the old assertion,
we will lose the information that (P(x) ∨ R(x)) is true for values of x other than A. We could
just leave the old assertion as it is, but this would create an undesirable redundancy. If we
came along later with the goal P(y), we would match (P(x) ∨ R(x)) and generate the goal
¬R(y). We would also match (P(A) <=> Q(A)) and generate the goal Q(A), which would in turn
match (Q(A) ∨ R(A)) and generate the goal ¬R(A). But this is a special case of a goal we
have already generated, and is therefore redundant. In complicated situations this can lead
to massive generation of redundant goals.

A solution to this problem is to have the replacement rule change the first assertion to
be (P(x) ∨ R(x))/[x ≠ A]. That is, we do effectively delete the particular instance of the
assertion that our replacement rule applies to, by putting a syntactic restriction on the
assertion not to match that instance. With this procedure, if we have as a goal P(y) we will
ultimately generate one goal which is ¬R(A) and another goal which is ¬R(x)/[x ≠ A]. These
two goals are mutually exclusive as to the patterns they will match, so the redundancy is
eliminated.

Except  for  the  equality  rules  to  be  discussed  in  the  next  section,  these  are  all  the 
inference  rules  that  we  will  use  in  our  deduction  system.  As  they  stand,  they  are  far  from 
forming  a  logically  complete  system.  The  most  glaring  deficiency  is  an  inability  to  do 
reasoning by cases. That is, even if we have asserted (P <= Q), (P <= R), and (Q ∨ R), we
cannot deduce P. Our system can be modified in a relatively straightforward way to handle
reasoning  by  cases  by  changing  the  treatment  of  disjunctive  assertions  to  a  splitting 
procedure  which  is  the  dual  of  the  one  we  are  using  for  goals.  A  system  of  this  type 
requires  a  much  more  complicated  control  structure  than  we  wish  to  use,  and  since  none  of 
our  sample  problems  involve  reasoning  by  cases,  it  did  not  seem  worth  the  effort  to 
describe.  To  see  what  is  required  for  such  a  system,  see  Nevins  (1974).  Nevins's  system  is 
quite  similar  to  ours,  but  he  does  not  impose  as  much  control  as  we  do  over  the  use  of 
complex  expressions,  and  he  does  not  use  syntactic  restrictions  at  all. 

Alternatively,  we  could  have  described  a  much  simpler  system  closer  in  spirit  to 
resolution,  but  conjunctive  goals  which  include  implications  would  not  be  handled  as 
naturally  as  in  a  system  based  on  splitting.  This  is  particularly  important  in  our  domain, 
since every attempt to prove that someone knows something generates a goal of the form
(K(:A,W1,W2) -> T(W2,P)). For a comparison of splitting-based systems to resolution-style
systems, see chapter 3 of Moore (1975).

The  global  control  strategy  we  will  use  is  simply  depth-first  search.  We  could  put  in 
some  heuristics  to  try  to  be  more  intelligent,  but  they  would  be  unnecessary.  The  point  is 
that  just  by  the  choice  of  procedural  interpretations  of  our  domain-specific  axioms  and  the 
use  of  syntactic  restrictions,  we  can  constrain  the  search  space  for  our  examples  so  tightly 
that  the  order  in  which  the  space  is  searched  does  not  matter  very  much. 

6.4  Inference  Rules  for  Equality 

In  this  section  we  give  the  inference  rules  for  reasoning  about  equality.  The  first  rules 
embody  the  fact  that  everything  is  equal  to  itself: 

1. Replace any expression of the form (A = A) by T.

2. Replace any expression of the form (A ≠ A) by F.

3. If (A = B) is a goal where A and B are unifiable, solve the goal by unifying A and B.

The first two rules should require no explanation. The point of actually carrying out the
unification in the third rule is that the goal (A = B) may have been generated by splitting a
conjunctive goal, and the variable bindings created by the unification may be required by
the other conjuncts of the goal which was split.

One practical problem in reasoning about equality in an AI system is that typically
there are large numbers of specific individuals that the system knows about and has names
for. In the blocks world every block usually has an "internal" name, and in circuit analysis
systems every component is usually given a unique identifier. The problem this creates is
that to reason using these identifiers, the system needs to know that they refer to distinct
individuals. For instance, suppose there are three blocks, A, B, and C, and B is put on C.
To be able to infer that A is still in its original location, the system must not only
understand the effects of putting one block on another, it must also know that A and B are
not the same block.

In standard logic the only way of indicating that A and B are not the same is to have a
specific axiom (A ≠ B). To avoid cluttering up our system with large numbers of axioms of
this form, we can make use of the notion of a standard name for an individual which we
introduced in section 2.5. If we assume that each individual has only one standard name,
then it follows that two syntactically distinct standard names must name different
individuals. We will designate certain constants in our formalism as being standard names,
and we will build into our system the assumption that two distinct standard names cannot
be equal. This fact gives us the following two rules:

4. If A and B are different standard names, replace (A = B) by F.

5. If A and B are different standard names, replace (A ≠ B) by T.

We also have functions that act as constructors of standard names. A standard name
constructor is a function symbol such that a term consisting of the function symbol applied
to standard names is itself a standard name. For instance, :Dial(:C1,:Sf1) is the standard
name of the action of dialing the combination named by :C1 on the safe named by :Sf1, just
in case :C1 and :Sf1 are themselves standard names.

In  an  actual  implementation  we  would  probably  pick  some  notational  convention  to 
distinguish  standard  names  from  other  terms.  In  our  examples,  however,  we  will  not  use 
any  special  notation,  but  will  simply  point  out  when  we  are  assuming  that  an  expression  is  a 
standard  name.  We  will  note,  however,  that  the  meta-language  terms  that  we  use  to  denote 
object-language  expressions  will  be  regarded  as  standard  names  of  those  expressions.  Also, 
the  terms  that  denote  intensional  objects  will  be  the  standard  names  of  those  objects. 


Since there are several levels to deal with, we must be careful which level we are in. For
example, Comb(Sf1) is the standard name of the object-language expression which means
"the combination of Sf1". That is, we know implicitly that Comb(Sf1) ≠ Comb(Sf2), because
the two terms denote different object-language expressions. :Comb(:Sf1) is the standard
name of the intensional object corresponding to the combination of the safe named by :Sf1,
just in case :Sf1 is the standard name of the safe. V(W0,:Comb(:Sf1)) refers to the actual
combination of the safe, so it can't be a standard name since it might be the case that
V(W0,:Comb(:Sf1)) = V(W0,:Comb(:Sf2)). That is, two different safes can have the same
combination.

There are four special rules for standard name constructors:

6. If F and G are different standard name constructors, replace all expressions of the
form F(A1,...,An) = G(B1,...,Bn) by F.

7. If F and G are different standard name constructors, replace all expressions of the
form F(A1,...,An) ≠ G(B1,...,Bn) by T.

8. If F is a standard name constructor, replace all expressions of the form F(A1,...,An)
= F(B1,...,Bn) by ((A1 = B1) ∧ ... ∧ (An = Bn)).

9. If F is a standard name constructor, replace all expressions of the form F(A1,...,An)
≠ F(B1,...,Bn) by ((A1 ≠ B1) ∨ ... ∨ (An ≠ Bn)).
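Rules 1, 2, and 4 through 9 can be read as a small simplifier for equalities and inequalities. The Python sketch below is one possible reading, under assumptions that are not in the text: terms are nested tuples, "!=" stands for ≠, the sets STANDARD_NAMES and CONSTRUCTORS are hypothetical sample data, and truth values produced for the components are folded into the resulting conjunction or disjunction as a convenience.

```python
STANDARD_NAMES = {":C1", ":Sf1", ":Sf2"}   # hypothetical sample constants
CONSTRUCTORS = {":Dial", ":Comb"}          # hypothetical standard name constructors

def is_standard_name(t):
    """A constant declared standard, or a constructor applied to standard names."""
    if isinstance(t, str):
        return t in STANDARD_NAMES
    return t[0] in CONSTRUCTORS and all(is_standard_name(a) for a in t[1:])

def combine(positive, parts):
    """Conjoin (for =) or disjoin (for !=) the component results, folding constants."""
    absorbing = not positive   # False absorbs a conjunction, True a disjunction
    if any(p is absorbing for p in parts):
        return absorbing
    rest = [p for p in parts if not isinstance(p, bool)]
    if not rest:
        return positive
    if len(rest) == 1:
        return rest[0]
    return ("and" if positive else "or", *rest)

def simplify_eq(op, a, b):
    """op is '=' or '!='. Returns True, False, or a (possibly simpler) formula."""
    positive = (op == "=")
    if a == b:                                        # rules 1 and 2
        return positive
    if is_standard_name(a) and is_standard_name(b):   # rules 4 and 5
        return not positive
    a_con = isinstance(a, tuple) and a[0] in CONSTRUCTORS
    b_con = isinstance(b, tuple) and b[0] in CONSTRUCTORS
    if a_con and b_con:
        if a[0] != b[0]:                              # rules 6 and 7
            return not positive
        if len(a) == len(b):                          # rules 8 and 9
            parts = [simplify_eq(op, x, y) for x, y in zip(a[1:], b[1:])]
            return combine(positive, parts)
    return (op, a, b)                                 # no rule applies
```

Under this reading, V(W0,:Comb(:Sf1)) = V(W0,:Comb(:Sf2)) is left untouched, since V is not a standard name constructor, which matches the observation that two safes may share a combination.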

After all of the previous rules have been applied, the following more general rules are
applied:

10. If (A = B) is an assertion, replace all active occurrences of A by B, unless A is a
standard name, in which case, replace all active occurrences of B by A.

11. If (A ≠ B) is a goal, for each assertion of the form P(A), generate the goal ¬P(B).
If that cannot be proved, for each assertion of the form P(B), generate the goal
¬P(A).

12. If F(A1,...,An) = F(B1,...,Bn) is a goal, generate (A1 = B1) ∧ ... ∧ (An = Bn) as a goal.

13. If F(A1,...,An) ≠ F(B1,...,Bn) is an assertion, also assert (A1 ≠ B1) ∨ ... ∨ (An ≠ Bn).

Rule  10  is  the  standard  equality  substitution  rule.  It  rewrites  all  expressions  involving 
one  of  the  terms  as  expressions  involving  the  other.  When  given  a  chance,  it  prefers  to 
state  things  in  terms  of  standard  names,  because  this  cuts  down  the  possibilities  for  further 
equality  substitutions,  and  also  because  there  is  usually  more  known  about  an  object  under 
its standard name than under other descriptions. This rule needs the same modification as
the replacement rule for <=>. If the expression that would be replaced is more general than
the replacement rule, the old expression is not deleted, but is instead modified by a syntactic
restriction. For instance, if F(A) = B is applied to the formula P(F(x)), the result is the two
formulas P(B) and P(F(x))/[x ≠ A].
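The orientation preference in rule 10 can be sketched as below. This Python fragment is an assumption-laden illustration: formulas are nested tuples, the standard-name test is passed in as a function, every occurrence is rewritten rather than only active ones, and the more-general-occurrence case with its syntactic restriction is not handled.

```python
def replace_all(term, old, new):
    """Replace every occurrence of `old` inside `term` by `new`."""
    if term == old:
        return new
    if isinstance(term, tuple):
        return tuple(replace_all(t, old, new) for t in term)
    return term

def apply_equality(a, b, formula, is_standard_name):
    """Use the assertion (a = b) as a rewrite rule on `formula`.

    Occurrences of a are replaced by b, unless a is a standard name, in which
    case occurrences of b are replaced by a."""
    old, new = (b, a) if is_standard_name(a) else (a, b)
    return replace_all(formula, old, new)

# Hypothetical data: :B' is taken to be a standard name.
is_std = lambda t: isinstance(t, str) and t in {":B'", ":C'"}
formula = ("=", ("V", "W0", ":B"), ("V", "W0", ":C"))
result = apply_equality(("V", "W0", ":B"), ":B'", formula, is_std)
```

Whichever way the equality is written, the result is stated in terms of the standard name.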

Rule 11 is the dual of equality substitution. The idea is that two individuals cannot be
the  same  if  they  differ  in  some  property.  Since  this  rule  is  so  general  it  should  be  tried  only 
as  a  last  resort.  It  probably  could  be  tightened  up,  but  since  no  applications  of  it  will  be 
made  in  our  examples,  there  is  little  motivation  to  do  so. 

As  with  the  rules  in  the  previous  section,  these  rules  for  equality  are  incomplete, 
although they are adequate for our examples. In particular, equalities which are part of a
larger assertion, e.g. ((A = B) ∨ P), and equalities which permute expressions, e.g. (x1*x2 =
x2*x1), are not adequately handled. These problems are discussed more fully in chapter 4
of Moore (1975).

6.5  Procedural  Interpretation  of  the  Axioms  for  Knowledge 

In this section we will give procedural interpretations to the basic axioms for knowledge.
(The procedural versions of all our axioms are listed for reference in appendix B.) For
axioms L1 - L13 the procedural interpretations are quite straightforward. Each of these
axioms or schemas specifies an equivalence between an object-language expression and its
meta-language interpretation. The procedural interpretation of these axioms will be simply
to replace the object-language expression by its meta-language equivalent.

L1.   True(p1) <=> T(W0,p1)

L2.   T(w1,And(p1,p2)) <=> (T(w1,p1) ∧ T(w1,p2))

L3.   T(w1,Or(p1,p2)) <=> (T(w1,p1) ∨ T(w1,p2))

L4a.  T(w1,(p1 -> p2)) <=> (T(w1,p1) -> T(w1,p2))

L4b.  T(w1,(p1 <- p2)) <=> (T(w1,p1) <- T(w1,p2))

L4c.  T(w1,(p1 => p2)) <=> (T(w1,p1) => T(w1,p2))

L4d.  T(w1,(p1 <= p2)) <=> (T(w1,p1) <= T(w1,p2))

L5a.  T(w1,(p1 <-> p2)) <=> (T(w1,p1) <-> T(w1,p2))

L5b.  T(w1,(p1 <=> p2)) <=> (T(w1,p1) <=> T(w1,p2))

L6.   T(w1,Not(p1)) <=> ¬T(w1,p1)

L7.   T(w1,Exist(?S1,P)) <=> ∃s1(T(w1,P[@(s1)/?S1]))

L8.   T(w1,All(?S1,P)) <=> ∀s1(T(w1,P[@(s1)/?S1]))

L9a.  T(w1,P(trm1,...,trmn)) <=> H(w1,:P(D(w1,trm1),...,D(w1,trmn)))
      if P is not an essential property of the things it is true of.

L9b.  T(w1,P(trm1,...,trmn)) <=> :P(D(w1,trm1),...,D(w1,trmn))
      if P is an essential property of the things it is true of.

L10a. D(w1,@(x1)) = x1

L10b. (@(x1) = @(x2)) <=> (x1 = x2)

L11a. D(w1,Cnst) = V(w1,:Cnst) if Cnst is not a rigid designator.

L11b. D(w1,Cnst) = :Cnst if Cnst is a rigid designator.

L12a. D(w1,F(trm1,...,trmn)) = V(w1,:F(D(w1,trm1),...,D(w1,trmn)))
      if F is not a rigid function.

L12b. D(w1,F(trm1,...,trmn)) = :F(D(w1,trm1),...,D(w1,trmn))
      if F is a rigid function.

L13.  T(w1,Eq(trm1,trm2)) <=> (D(w1,trm1) = D(w1,trm2))

These axioms are basically straightforward translation rules, but there are a couple of
interesting points. First, since the meta-language now has several forms for implications
and biconditionals, we have augmented the object language to contain these same forms.
Although we have used the same symbols in both the object language and the
meta-language, the context will always disambiguate their use.

Second, we have introduced L10b as a new simplification rule. It says that if two
standard names are the same, the objects which they refer to must also be the same. This is
actually a logical consequence of L10a and is not strictly necessary, but it will simplify
certain proofs to have it explicitly asserted.
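The propositional part of these translation rules (L1 - L6, plus L9a for predicates without arguments) can be sketched as a recursive function. This Python illustration makes several simplifying assumptions that are not in the text: formulas are nested tuples, every atomic proposition is treated as a non-essential property so that it translates to H(w,:P), quantifiers, terms, and Know are left out, and the whole formula is translated eagerly, whereas the system described here translates a subexpression only when it becomes active.

```python
CONNECTIVES = {"->", "<-", "=>", "<=", "<->", "<=>"}   # L4a - L5b carry these through unchanged

def translate(w, p):
    """Translate the object-language formula p, evaluated in world w,
    into its meta-language equivalent."""
    if isinstance(p, tuple):
        op = p[0]
        if op == "And":                                   # L2
            return ("and", translate(w, p[1]), translate(w, p[2]))
        if op == "Or":                                    # L3
            return ("or", translate(w, p[1]), translate(w, p[2]))
        if op == "Not":                                   # L6
            return ("not", translate(w, p[1]))
        if op in CONNECTIVES:                             # L4a - L5b
            return (op, translate(w, p[1]), translate(w, p[2]))
        return ("T", w, p)                                # not handled in this sketch
    return ("H", w, ":" + p)                              # L9a with no arguments

def true(p):
    """L1: True(p) is T(W0, p)."""
    return translate("W0", p)

example = true(("=>", "P", "Q"))
```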

K1. T(w1,Know(trm.a1,p1)) <=> ∀w2(K(D(w1,trm.a1),w1,w2) -> T(w2,p1))

K2. K(a1,w1,w1)

K3. K(a1,w1,w2)/[w1 ≠ w2] -> (K(a1,w2,w3)/[w2 ≠ w3] -> K(a1,w1,w3))

Axiom K2 is also very simple, being an atomic assertion, but K1 and K3 are more
complicated. Like L1 - L13, K1 translates from the object language into the meta-language,
but the meta-language side contains an implication for which we have to choose a
procedural interpretation. The interpretation we have chosen is to assert that everything
that John knows in W1 is true in W2, for any world W2 such that K(A,W1,W2) is asserted.
Furthermore, the implications in K3 have been interpreted in a way that promotes the
principle that whenever a formula of the form K(A,W1,W2) is true, it should be explicitly
asserted.

The reason for these decisions is the fact that otherwise, forward chaining in the context
of someone's knowledge will not work. In section 6.3 we cited several cases where efficient
reasoning about ordinary, non-modal concepts requires forward chaining. Suppose (P ⊃ Q)
is such a case. That means that given (P ⊃ Q) and P as assertions and Q as a goal, we
should proceed by reasoning forward from P to Q rather than backwards from Q to P. We
might choose to represent (P ⊃ Q) as (P => Q).

Suppose that this reasoning was embedded in a knowledge context; e.g.
T(W0,Know(John,(P => Q))). Presumably, we still want (P => Q) to function as a
forward-chaining rule, so the meta-language expression of John's knowledge would be a
forward-chaining rule that asserts H(W1,:Q) for any world W1 for which H(W1,:P) is asserted,
provided K(:John,W0,W1) is true. Now the meta-language translation of
T(W0,Know(John,(P => Q))) would be K(:John,W0,w1) -> T(w1,(P => Q)). Suppose that
K(:John,W0,W1) is asserted. This will result in T(W1,(P => Q)) being asserted, which will be
transformed by the L rules into H(W1,:P) => H(W1,:Q), which is exactly what we want. If we
had chosen to represent the meta-language translation of T(W0,Know(John,(P => Q))) as a
backward-chaining rule, the formula H(W1,:P) => H(W1,:Q) would not have been explicitly
asserted, and so would not function as a forward-chaining rule. Therefore, the right side of
K1 needs to be a forward-chaining rule. Furthermore, K(:John,W0,W1) also had to be
explicitly asserted; for it to be merely derivable would not have been enough. Therefore,
we will want facts like K(:John,W0,W1) to be asserted whenever possible, so rules like K3
will always be expressed in forward-chaining form.

Another point about K1 and K3 is that the procedural interpretation we have chosen
for the implications in those axioms ignores the contrapositive form of the axioms. The
most natural contrapositive of a forward-chaining rule which triggers on K(a1,w1,w2) would
be a backward-chaining rule for showing ¬K(a1,w1,w2). For instance, K3 could give rise to
a rule which says to prove something of the form ¬K(a1,w1,w2) try proving K(a1,w2,w3) and
¬K(a1,w1,w3). The reason that we ignore the contrapositive form of assertions like K3 or
the right hand side of K1 is that we can structure our system so that assertions and goals of
the form ¬K(a1,w1,w2) do not occur.

To see this, note that the only axiom which translates from the object language to the
meta-language and produces formulas containing the predicate K is K1. If we assert
T(W1,Know(A,P)) we obviously do not generate any formulas containing anything of the form
¬K(a1,w1,w2). If we have a goal of the form T(W1,Know(A,P)), we would generate a subgoal
of the form ∀w2(K(:A,W1,w2) -> T(w2,P)). This would be Skolemized to something like
(K(:A,W1,W2) -> T(W2,P)), which would be proved by natural deduction, asserting
K(:A,W1,W2) and deriving T(W2,P).

Conversely, asserting T(W1,Not(Know(A,P))) would result in asserting something like
K(:A,W1,W2) and ¬T(W2,P), and trying to show T(W1,Not(Know(A,P))) would generate the
subgoal (K(:A,W1,w2) ∧ ¬T(w2,P)).

In neither case is anything of the form ¬K(a1,w1,w2) generated. So if all problem
descriptions are stated in the object language, they will not create anything of the form
¬K(a1,w1,w2). Furthermore, it is quite easy to structure the other axioms so that they do not
introduce any formulas of that form. So whenever we have a forward-chaining rule that
triggers on K(a1,w1,w2), it will not be necessary to have the contrapositive rule. One way of
viewing this fact is to note that the meta-language of our formalism is in some ways richer
than the object language, but we do not need to make use of all of that richness.

A final point about K3 is that it includes syntactic restrictions on its application. These
syntactic restrictions are included because having K2 around makes it important to check
whether a rule produces useful information if its input is K(a1,w1,w1). If we did not place
these restrictions on K3, it would combine with K2 to produce the tautologous assertion
(K(a1,w1,w3) -> K(a1,w1,w3)). Moreover, even if w1 and w2 are distinct, the inner rule of
K3 requires w2 and w3 to be distinct to avoid asserting K(a1,w1,w2), which would simply
repeat the initial pattern which triggered the inference.
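The effect of using K3 as a restricted forward-chaining rule over ground facts can be sketched as follows. This Python fragment is an illustration only, with assumptions not found in the text: each asserted K(a,w1,w2) is a triple of constants, so the syntactic tests reduce to simple inequality of names, and the function name forward_chain_k3 is hypothetical.

```python
def forward_chain_k3(facts):
    """Close a set of (agent, w1, w2) triples, read as K(agent,w1,w2), under
    K3: K(a,w1,w2)/[w1 != w2] -> (K(a,w2,w3)/[w2 != w3] -> K(a,w1,w3))."""
    known = set(facts)
    agenda = list(facts)
    while agenda:
        a, w1, w2 = agenda.pop()
        if w1 == w2:
            continue   # a reflexive fact (as supplied by K2) triggers nothing
        new = set()
        for (b, u, v) in known:
            if b != a or u == v:
                continue   # other agent, or blocked by a syntactic restriction
            if u == w2:
                new.add((a, w1, v))    # K(a,w1,w2) and K(a,w2,v) give K(a,w1,v)
            if v == w1:
                new.add((a, u, w2))    # K(a,u,w1) and K(a,w1,w2) give K(a,u,w2)
        for fact in new - known:
            known.add(fact)
            agenda.append(fact)
    return known

facts = {(":A", "W0", "W1"), (":A", "W1", "W2"), (":A", "W0", "W0")}
closure = forward_chain_k3(facts)
```

In this sample the reflexive fact K(:A,W0,W0) contributes nothing, and the only new assertion is K(:A,W0,W2).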

6.6  Some  Examples 

We  are  now  in  a  position  to  work  out  some  examples  of  automatically  generated 
deductions  about  knowledge.  It  should  be  noted  that  no  program  has  been  written  to 
produce these deductions, so our examples are subject to all the possible errors and omissions
of  hand  simulations. 

Our  "automatically"  generated  proofs  will  be  produced  by  applying  all  applicable 
inference  rules  in  a  depth-first  fashion.  The  order  of  application  of  the  rules  will  be  to  first 
apply  any  rules  which  replace  the  current  expression  by  another  expression,  then  if  no 
further  rules  of  that  type  apply,  to  try  any  other  rule.  Within  these  two  groups  of  rules  we 
will  follow  the  order  that  they  are  presented  in  the  text.  Rules  which  involve  a  second 
formula will take those formulas in the order they appear in the proof. The point is to
show  that  the  search  space  is  so  tightly  controlled  that  a  fixed  search  strategy  produces 
satisfactory  results. 

In respect to the form of proofs, indentations will indicate the tree structure of the proof,
with each indented line being directly derived from the most recent line at the next higher
level. These proofs will intermix assertions and goals. Goals will be distinguished by being
prefixed with a *. Formulas which are deleted as they are generated, whether by
application of a deletion rule or by replacement by another formula, will be prefixed with a
•. A solved goal will be indicated by *T.

Since the structure of the proof indicates which preceding line each line is immediately
derived from, the Justification column gives only the additional lines or axioms used for
that step. We will suppress much of the detail of translating from the object language to
the meta-language. Whenever two or more consecutive steps involve rules L1 - L13, we will
combine them into a single step and simply give L as the Justification.

The special justification notations Ante and Cons indicate the assertion antecedent and
the goal consequent of an implication being proved by natural deduction. Subgoals
generated by splitting a conjunctive goal will be indicated by the notation Split. The
notation Eq indicates the application of one of the simplification rules for equality.

One  simple  example  is  to  show  that  if  A  knows  that  P  implies  Q  then,  if  A  knows  that  P, 
then  A  knows  that  Q. 

Prove: True(Know(A,(P => Q)) => (Know(A,P) => Know(A,Q)))


 1.  •*True(Know(A,(P => Q)) => (Know(A,P) => Know(A,Q)))             Goal
 2.  •*T(W0,Know(A,(P => Q))) => T(W0,(Know(A,P) => Know(A,Q)))       L
 3.  •T(W0,Know(A,(P => Q)))                                          Ante
 4.  K(:A,W0,w1) -> T(w1,(P => Q))                                    K1
 5.  •T(W0,(P => Q))                                                  K2
 6.  H(W0,:P) => T(W0,Q)                                              L
 7.  •*T(W0,(Know(A,P) => Know(A,Q)))                                 Cons
 8.  •*T(W0,Know(A,P)) => T(W0,Know(A,Q))                             L
 9.  •T(W0,Know(A,P))                                                 Ante
10.  K(:A,W0,w1) -> T(w1,P)                                           K1
11.  •T(W0,P)                                                         K2
12.  H(W0,:P)                                                         L
13.  •T(W0,Q)                                                         6
14.  H(W0,:Q)                                                         L
15.  •*T(W0,Know(A,Q))                                                Cons
16.  •*K(:A,W0,W1) -> T(W1,Q)                                         K1
17.  K(:A,W0,W1)                                                      Ante
18.  K(:A,W1,w3)/[W1 ≠ w3] -> K(:A,W0,w3)                             K3
19.  •T(W1,(P => Q))                                                  4
20.  H(W1,:P) => T(W1,Q)                                              L
21.  •T(W1,P)                                                         10
22.  H(W1,:P)                                                         L
23.  •T(W1,Q)                                                         20
24.  H(W1,:Q)                                                         L
25.  •*T(W1,Q)                                                        Cons
26.  *H(W1,:Q)                                                        L
27.  *T                                                               24

Line 1 is the statement of the problem in the object language. Line 2 translates this into
a meta-language implication to be proved by natural deduction. Line 3 asserts the
antecedent, that A knows that P implies Q. Line 4 translates this into meta-language terms,
saying that P implies Q in every world which is compatible with what A knows in W0. We
are treating A as a rigid designator to make the formulas simpler, although this does not
affect the length of the proof. Since every world is compatible with what anyone knows in
that world, P implies Q in W0 (lines 5 - 6). We now try to prove the consequent of line 2,
that if A knows that P, then A knows that Q (line 7). This translates into a meta-language
implication (line 8), so the antecedent is asserted (line 9), which translates into the assertion
that P is true in every world which is compatible with what A knows in W0 (line 10). This
implies that P is true in W0 (lines 11 - 12), and hence, that Q is true in W0 (lines 13 - 14).

We now try to prove the consequent of line 8, that A knows that Q (line 15), which
translates into the goal of proving that Q is true in every world compatible with what A
knows in W0. The quantifier in this goal is removed by Skolemization, so we try to prove
that if W1 is a typical world which is possible according to what A knows in W0, then Q is
true in W1 (line 16). To prove this implication we assert the antecedent, that W1 is possible
according to what A knows in W0 (line 17), which triggers an application of K3 (line 18),
and also implies that P implies Q in W1 (lines 19 - 20), and that P is true in W1 (lines 21 -
22), hence Q is true in W1 (lines 23 - 24). We now try to prove the consequent of line 16,
that Q is true in W1 (lines 25 - 26). This immediately succeeds (line 27), completing the
proof.


It  is  worth  making  a  few  comments  about  this  proof.  First  of  all,  not  every  formula 
generated  was  needed  for  the  proof.  In  fact,  none  of  the  inferences  that  depended  on 
axioms  K2  or  K3  were  used.  These  inferences,  however,  accounted  for  only  seven  of  the 
twenty-seven  lines  generated.  Perhaps  more  significant  is  the  low  branching  factor  of  the 
proof  tree.  For  the  non-terminal  nodes  of  the  tree  (i.e.  those  formulas  which  generated  at 
least  one  other  formula)  the  average  number  of  branches  was  less  than  1.5.  This  is  reflected 
in  the  fact  that  of  the  twenty-seven  formulas  generated,  fifteen  were  immediately  replaced 
by  other  formulas  and  deleted.  These  are  cases  where  the  knowledge  that  there  is  only  one 
reasonable  inference  to  make  from  a  formula  is  embedded  in  the  rules  of  inference  and 
axioms  of  the  system. 

Finally  it  should  be  emphasized  that  these  are  the  only  inferences  that  can  be  made 
from  the  initial  problem  statement,  given  the  way  the  axioms  are  structured.  If  we  modified 
the problem slightly, so that the final goal were to prove that Q is not true in W1, there
would  still  only  be  about  thirty  formulas  generated,  even  though  the  proof  would  fail.  This 
would  definitely  not  be  the  case  if  we  turned  a  standard  theorem  prover  loose  on  the  purely 
logical  version  of  the  formalism  given  in  chapter  4.  There  are  many  possibilities  for 
infinite search paths through these axioms, such as trying to prove T(W1,Q) by proving
T(W1,And(Q,p1)) or by the meta-language equivalent of trying to prove that A knows that Q
by  proving  that  A  knows  that  he  knows  that  Q.  (Both  of  these  approaches  obviously  recurse 
infinitely.)  If  we  tried  to  prove  anything  that  did  not  follow  from  the  premises  of  the 
problem,  a  typical  theorem-proving  algorithm  would  never  terminate.  It  is  the  care  taken  in 
structuring  the  domain-defining  axioms  that  makes  our  system  tightly  controlled. 

As a second example, we will show an algorithmically generated proof that if A knows
who B is and A knows who C is, then A knows whether B equals C.

Given: True(Exist(?X1,Know(A,Eq(B,?X1))))
       True(Exist(?X1,Know(A,Eq(C,?X1))))


Prove: True(And((Eq(B,C) => Know(A,Eq(B,C))),(Not(Eq(B,C)) => Know(A,Not(Eq(B,C))))))


1 .  eTrue  (Exist  (?X ,  ,Know(A,Eq(B,TX , ))))  Given 

2.  K(:A,W0,w,)  ->  T(w|,Eq(B,e(:B’))  L,K1 

3.  •T(W0,Eq(B,a(:B’)»  K2 

4.  V(W0,:B)  -  :B*  L 

5.  «True(Exist(?X j  ,Know(A,Eq(C,TX j ))})  Given 

6.  K(:A,Wq,w j  )  ->  T(w,  ,Eq(C,®(:C'»  L,Kl 

7.  •T(W0,Eq(C,O{:C,»)  K2 

8.  V(W0,:C)  •  :C’  L 


Lines 1 - 8 give the premises of the problem and the forward deductions made from
them. Line 1 is the first premise, and line 2 is its translation into the meta-language. We
have combined several applications of L rules and an application of K1 into a single step.
The object-language premise says that there is some individual which A knows to be named
by B, which translates into the meta-language assertion that there is some individual
(represented by the Skolem constant :B') which is the denotation of B in every world which
is compatible with what A knows in W0. Since W0 itself is one of those worlds, :B' is the
denotation of B in W0 (lines 3 - 4). Lines 5 - 8 make the analogous inferences for the
premise that A knows who C is. As in the previous example, we are assuming that A is a
rigid designator to simplify the meta-language formulas, without affecting the length of the
proof. B and C, of course, must not be rigid designators, to avoid trivializing the proof.


9.  •*True(And((Eq(B,C) => Know(A,Eq(B,C))),                 Goal
               (Not(Eq(B,C)) => Know(A,Not(Eq(B,C))))))

10. •*T(W0,(Eq(B,C) => Know(A,Eq(B,C)))) ∧                   L
      T(W0,(Not(Eq(B,C)) => Know(A,Not(Eq(B,C)))))

11. •*T(W0,(Eq(B,C) => Know(A,Eq(B,C))))                     Split

12. •*T(W0,Eq(B,C)) -> T(W0,Know(A,Eq(B,C)))                 L

13. •T(W0,Eq(B,C))                                           Ante

14. •V(W0,:B) = V(W0,:C)                                     L

15. •:B' = V(W0,:C)                                          4

16. :B' = :C'                                                8

17. V(W0,:B) = :C'                                           4


Now we try to prove the goal, that if B and C are the same individual, A knows that
they are the same individual, and if they are not the same individual, he knows that they
are not. Line 9 states the goal in the object language, and line 10 transforms it into a
meta-language conjunction to be solved by splitting. Line 11 states the first conjunct of
the split, and line 12 converts it into a meta-language implication, to be proved by natural
deduction. Line 13 asserts the antecedent, which translates into the meta-language statement
that the denotation of B in W0 is the same as the denotation of C in W0 (line 14). Since the
denotation of B is :B' and the denotation of C is :C', it follows that :B' is the same as :C'
(lines 15 - 16). Making this inference causes the instance of :B' in line 4 to be replaced
by :C' (line 17). From this point on in this branch of the split, line 4 is deleted.


18. •*T(W0,Know(A,Eq(B,C)))                      Cons

19. •*K(:A,W0,W1) -> T(W1,Eq(B,C))               L,K1

20. K(:A,W0,W1)                                  Ante

21. K(:A,W1,w3)/[W1 ≠ w3] -> K(:A,W0,w3)         K3

22. •T(W1,Eq(B,@(:B')))                          2

23. •V(W1,:B) = :B'                              L

24. V(W1,:B) = :C'                               16

25. •T(W1,Eq(C,@(:C')))                          6

26. V(W1,:C) = :C'                               L

27. •*T(W1,Eq(B,C))                              Cons

28. •*V(W1,:B) = V(W1,:C)                        L

29. •*:C' = V(W1,:C)                             24

30. •*:C' = :C'                                  26

31. *T                                           Eq


Line 18 makes the consequent of line 12 into the goal of showing that A knows that B
and C are the same individual. This translates into the meta-language goal of showing that
in every world (represented by the Skolem constant W1) which is compatible with what A
knows in W0, B and C refer to the same individual (line 19). This is itself an implication to
be attacked using natural deduction, so we assert the antecedent (line 20), which triggers K3
(line 21). The fact that W1 is one of the worlds which are compatible with what A knows in
W0 triggers lines 2 and 6 to assert that the denotations of B and C in W1 are :B' and :C',
respectively (lines 23 and 26). The occurrence of :B' in line 23 is replaced by :C' (line 24).
We now try to prove the consequent of line 19, by showing that the denotation of B in W1
is the same as the denotation of C in W1 (lines 27 - 28). Since the denotations of B and C
are both :C', this goal is transformed into the goal of showing that :C' is the same as :C'
(line 30), which is immediately satisfied, completing the proof of the first branch of the
split (line 31).


32. •*T(W0,(Not(Eq(B,C)) => Know(A,Not(Eq(B,C)))))           Split

33. •*T(W0,Not(Eq(B,C))) -> T(W0,Know(A,Not(Eq(B,C))))       L

34. •T(W0,Not(Eq(B,C)))                                      Ante

35. •V(W0,:B) ≠ V(W0,:C)                                     L

36. •:B' ≠ V(W0,:C)                                          4

37. :B' ≠ :C'                                                8

38. •*T(W0,Know(A,Not(Eq(B,C))))                             Cons

39. •*K(:A,W0,W1) -> T(W1,Not(Eq(B,C)))                      L,K1

40. K(:A,W0,W1)                                              Ante

41. K(:A,W1,w3)/[W1 ≠ w3] -> K(:A,W0,w3)                     K3

42. •T(W1,Eq(B,@(:B')))                                      2

43. V(W1,:B) = :B'                                           L

44. •T(W1,Eq(C,@(:C')))                                      6

45. V(W1,:C) = :C'                                           L

46. •*T(W1,Not(Eq(B,C)))                                     Cons

47. •*V(W1,:B) ≠ V(W1,:C)                                    L

48. •*:B' ≠ V(W1,:C)                                         43

49. •*:B' ≠ :C'                                              45

50. *T                                                       37


The proof of the second branch of the split is quite similar to that of the first. We
begin with the goal of showing that if B and C are not the same, A knows that they are not
the same (lines 32 - 33). This is an implication, so we assert the antecedent (line 34), which
is translated into the meta-language assertion that the denotation of B in W0 is not the
same as the denotation of C in W0 (line 35). Since we know from lines 4 and 8 that these
denotations are :B' and :C', respectively, we infer that :B' and :C' are not the same
(lines 36 - 37).

Now we try to prove the consequent of line 33, showing that A knows that B and C are
not the same individual (line 38). This translates into the meta-language goal of showing
that in every world (represented by the Skolem constant W1) which is compatible with what
A knows in W0, B is not the same individual as C (line 39). We assert that W1 is compatible
with what A knows in W0 (line 40), which triggers K3 (line 41), and implies that the
denotations of B and C in W1 are :B' and :C', respectively (lines 42 - 45). We now try to
prove that the denotations of B and C in W1 are not the same (lines 46 - 47). Lines 43 and
45 transform this into the goal of showing that :B' and :C' are not the same (lines 48 - 49).
This matches the assertion on line 37, so we are done (line 50). This completes both
branches resulting from splitting line 10, so the entire proof is complete.
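
The result just proved can also be checked semantically on small models. The following
is a minimal sketch, not part of the proof procedure described in this report: it
enumerates every two-world structure with two individuals and a reflexive compatibility
relation (as axiom K2 requires), and searches for a world in which A knows who B is and
who C is but does not know whether B equals C. The names WORLDS, INDIVIDUALS, knows_who
and counterexamples are invented for this illustration.

```python
from itertools import product

WORLDS = (0, 1)
INDIVIDUALS = (0, 1)


def knows_who(K, den, w0):
    """True if one individual is the denotation of the term in every world
    compatible with what A knows in w0."""
    return any(all(den[w1] == x for w1 in WORLDS if K[w0][w1])
               for x in INDIVIDUALS)


def knows(K, w0, fact):
    """True if `fact` holds in every world compatible with what A knows in w0."""
    return all(fact(w1) for w1 in WORLDS if K[w0][w1])


def knows_whether_equal(K, vB, vC, w0):
    """The goal formula: (B = C => Know(B = C)) and (B != C => Know(B != C))."""
    if vB[w0] == vC[w0]:
        return knows(K, w0, lambda w: vB[w] == vC[w])
    return knows(K, w0, lambda w: vB[w] != vC[w])


def counterexamples():
    """Collect every model and world where the premises hold and the goal fails."""
    found = []
    for k01, k10 in product((False, True), repeat=2):
        # K is reflexive by construction; only the off-diagonal entries vary.
        K = {0: {0: True, 1: k01}, 1: {0: k10, 1: True}}
        for vB in product(INDIVIDUALS, repeat=2):
            for vC in product(INDIVIDUALS, repeat=2):
                for w0 in WORLDS:
                    premises = knows_who(K, vB, w0) and knows_who(K, vC, w0)
                    if premises and not knows_whether_equal(K, vB, vC, w0):
                        found.append((K, vB, vC, w0))
    return found
```

The search finds no counterexample, which agrees with the proof: reflexivity forces the
individuals that A knows B and C to denote to be their denotations in the actual world.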
7.  Automating  Deductions  about  Knowledge  and  Action

7.1  Interpreting  Axioms  for  Knowledge  and  Action

In  this  chapter,  we  deal  with  more  complex  problems  of  algorithmically  generating 
deductions  involving  both  knowledge  and  action.  In  the  previous  chapter,  the  examples  of 
reasoning  about  knowledge  alone  involved  only  one  or  two  possible  worlds.  The  problem 
was  simply  to  set  up  a  typical  world  which  is  compatible  with  what  someone  knows,  and  to 
do  a  simple  deduction  relative  to  that  world.  In  reasoning  about  both  knowledge  and 
action,  however,  we  will  be  dealing  with  fairly  complicated  structures  of  several  possible 
worlds.  Managing  the  flow  of  information  among  these  possible  worlds  is  a  major  problem. 
The  axioms  relating  to  action  and  its  interaction  with  knowledge  must  be  structured  to 
manage  this  information  flow  in  an  efficient  manner.  These  issues  can  best  be  explored  by 
examining  the  axioms  involved.  It  may  be  helpful  to  refer  to  appendix  A  to  compare  the 
procedural  versions  of  these  axioms  with  the  purely  logical  versions. 

R1.  T(w1,Res(trm.ev1,p1))/
        [(trm.ev1 ≠ Do(trm.a1,(trm.act2; trm.act3))) ∧
         (trm.ev1 ≠ Do(trm.a1,If(p2,trm.act2,trm.act3))) ∧
         (trm.ev1 ≠ Do(trm.a1,While(p2,trm.act2)))] <->
     ∃w2(R(D(w1,Do(trm.a1,trm.act1)),w1,w2) ∧ T(w2,p1))

R2.  T(w1,Res(Do(trm.a1,(trm.act1; trm.act2)),p1)) <->
     T(w1,Res(Do(trm.a1,trm.act1),Res(Do(@(D(w1,trm.a1)),trm.act2),p1)))

R3.  T(w1,Res(Do(trm.a1,If(p1,trm.act1,trm.act2)),p2)) <->
     ((T(w1,p1) ∧ T(w1,Res(Do(trm.a1,trm.act1),p2))) ∨
      (¬T(w1,p1) ∧ T(w1,Res(Do(trm.a1,trm.act2),p2))))

R4.  T(w1,Res(Do(trm.a1,While(p1,trm.act1)),p2)) <->
     T(w1,Res(Do(trm.a1,If(p1,(trm.act1; While(p1,trm.act1)),Nil)),p2))

R5.  R(Do(trm.a1,Nil),w1,w2) <-> (w1 = w2)


Res is the basic object-language predicate for talking about the results of actions.
Recall from chapter 3 that T(W1,Res(Ev,P)) means that in the world W1 it is possible for the
event Ev to occur and that in the resulting situation/world P is true. R1 is a translation rule
from the object language to the meta-language that embodies this definition. The syntactic
restrictions in R1 prevent its application to events that are described as complex sequences
of actions. We use syntactic restrictions here because, although R1 is true for complex
sequences, heuristically we want to use R2 - R4 instead if they are applicable. The syntactic
restrictions on R1 simply rule it out in cases where R2 - R4 apply. R2 is an expansion rule
for object-language expressions which transforms a formula which talks about the results of
a sequence of actions into a formula which talks about doing the first action in the
sequence, and then doing the rest. If the first action in the sequence is a simple action, R1
can then be applied. Otherwise, the decomposition of the complex action continues. R3 - R5
describe similar decompositions for conditionals and loops.
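
The way R1 - R5 decompose a complex action can be illustrated with a small evaluator.
This is only a sketch under simplifying assumptions that are not made in the report:
actions are deterministic, a world is just a Python value, and a hypothetical function
step(world, agent, act) returns the successor world for a simple action, or None if the
action is impossible. A fuel counter (also an assumption) cuts off loops that do not
terminate.

```python
def res(world, agent, act, prop, step, fuel=50):
    """True if `act` can be carried out in `world` and `prop` holds afterwards."""
    kind = act[0]
    if kind == "seq":
        # R2: do the first action, then ask about the rest in the resulting world.
        _, first, rest = act
        return res(world, agent, first,
                   lambda w: res(w, agent, rest, prop, step, fuel), step, fuel)
    if kind == "if":
        # R3: choose a branch according to the test in the current world.
        _, test, then_act, else_act = act
        branch = then_act if test(world) else else_act
        return res(world, agent, branch, prop, step, fuel)
    if kind == "while":
        # R4: unfold the loop once into a conditional.
        _, test, body = act
        if fuel == 0:
            return False  # give up rather than unfold forever
        unfolded = ("if", test, ("seq", body, act), ("nil",))
        return res(world, agent, unfolded, prop, step, fuel - 1)
    if kind == "nil":
        # R5: doing nothing leaves the world unchanged.
        return prop(world)
    # R1: a simple action; there must be a resulting world in which prop holds.
    successor = step(world, agent, act)
    return successor is not None and prop(successor)


def counter_step(world, agent, act):
    """Toy domain: the world is a counter; 'dec' is impossible at zero."""
    if act[0] == "inc":
        return world + 1
    if act[0] == "dec":
        return world - 1 if world > 0 else None
    return None


countdown = ("while", lambda w: w > 0, ("dec",))
reaches_zero = res(3, "A", countdown, lambda w: w == 0, counter_step)
```

As in the rules themselves, the simple-action case is reached only after every
sequence, conditional, and loop has been peeled away.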

In addition to Res, we also have the weaker operator Res1. Recall that the difference
between Res and Res1 is that Res1 assumes that the event is possible, rather than asserting
that it is. That is, Res1(Ev,P) means that if Ev were to happen, P would be true in the
resulting situation, while Res(Ev,P) makes the additional assertion that it is possible for Ev to
happen. The procedural version of the axiom which defines Res1 for one-step actions is as
follows:

R6.  T(w1,Res1(trm.ev1,p1))/
        [(trm.ev1 ≠ Do(trm.a1,(trm.act2; trm.act3))) ∧
         (trm.ev1 ≠ Do(trm.a1,If(p2,trm.act2,trm.act3))) ∧
         (trm.ev1 ≠ Do(trm.a1,While(p2,trm.act2)))] <->
     ∀w2(R(D(w1,Do(trm.a1,trm.act1)),w1,w2) -> T(w2,p1))

Proving a goal involving Res1 will be the same as proving a goal involving Res, except that
we will assert that there is a situation which is the result of the event happening, instead of
proving that there is such a situation.
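
The contrast between the two operators can be seen in a few lines of code. In this
sketch, which is only illustrative, the relation R is represented as a set of
(event, world-before, world-after) triples; the event and world names are invented.

```python
def res(R, event, w1, prop):
    """Res: some outcome of the event exists in w1, and prop holds in it."""
    return any(prop(w2) for (ev, w, w2) in R if ev == event and w == w1)


def res1(R, event, w1, prop):
    """Res1: prop holds in every outcome of the event in w1.
    Vacuously true when the event is impossible in w1."""
    return all(prop(w2) for (ev, w, w2) in R if ev == event and w == w1)


R = {("dial", "W1", "W2")}
possible = res(R, "dial", "W1", lambda w: w == "W2")
impossible_res = res(R, "jump", "W1", lambda w: True)
impossible_res1 = res1(R, "jump", "W1", lambda w: False)
```

For an event with no outcome, Res fails whatever the proposition, while Res1 succeeds
whatever the proposition.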


C1.  T(w1,Can(trm.a1,trm.act1,p1))/
        [(trm.act1 ≠ (trm.act2; trm.act3)) ∧
         (trm.act1 ≠ If(p2,trm.act2,trm.act3)) ∧
         (trm.act1 ≠ While(p2,trm.act2))] <-
     T(w1,Know(trm.a1,And(Eq(@(D(w1,trm.act1)),trm.act1),
                          Res(Do(@(D(w1,trm.a1)),trm.act1),p1))))

C2.  T(w1,Can(trm.a1,(trm.act1; trm.act2),p1)) <->
     T(w1,Can(trm.a1,trm.act1,Can(@(D(w1,trm.a1)),trm.act2,p1)))

C3.  T(w1,Can(trm.a1,If(p1,trm.act1,trm.act2),p2)) <->
     ((T(w1,p1) ∧ T(w1,Can(trm.a1,trm.act1,p2))) ∨
      (¬T(w1,p1) ∧ T(w1,Can(trm.a1,trm.act2,p2))))

C4.  T(w1,Can(trm.a1,While(p1,trm.act1),p2)) <->
     T(w1,Can(trm.a1,If(p1,(trm.act1; While(p1,trm.act1)),Nil),p2))

The object-language operator Can describes the ability of an agent to obtain a result by
performing a given action. In essence, T(W1,Can(A,Act,P)) means that A knows how to
achieve P by doing Act. C1 says that T(W1,Can(A,Act,P)) is true if A knows what action Act
describes, and knows that his doing Act will bring about P. Like R1, C1 is restricted to
apply only to simple actions, but for a somewhat different reason. The trouble with
applying C1 to a complex action is that it imposes too strong a requirement on whether an
agent can carry out the action. Recall that in chapter 3, we said that knowing what action
Act describes amounts to knowing exactly how to carry out Act. But if Act describes a
sequence of actions, it need not exactly specify every step of the sequence in order for an
agent to be able to carry out the sequence. The first step must be specified exactly, but for
the remaining steps described by Act, it is only necessary that the agent know that at each
step he will know what to do next. This allows for steps during which the agent acquires
information about what to do. This idea is expressed by the expansion rule C2, with C3
and C4 integrating conditionals and loops into this structure.
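
The requirement that C1 places on a simple action can be sketched semantically. The
code below is an illustration only, with invented names: a world is a dictionary, an
action description is a function from a world to the concrete action it denotes there,
and the agent's knowledge is given as a list of alternative worlds (the actual world is
always added, as K2 requires).

```python
def dial(world, action):
    """Outcome of a ('dial', combination) action in `world`."""
    _, combination = action
    return {"comb": world["comb"],
            "open": world["open"] or combination == world["comb"]}


def can_simple(actual, alternatives, act_term, do, prop):
    """C1 for a simple action: in every world compatible with what the agent
    knows, the description denotes the same action, and doing it achieves prop."""
    action = act_term(actual)
    for w in [actual] + list(alternatives):
        if act_term(w) != action:
            return False  # the agent does not know what action is described
        result = do(w, action)
        if result is None or not prop(result):
            return False  # the agent does not know the action achieves prop
    return True


right = {"comb": "12-34", "open": False}
wrong = {"comb": "56-78", "open": False}
dial_the_comb = lambda w: ("dial", w["comb"])  # "dial the combination of the safe"
is_open = lambda w: w["open"]

knows_comb = can_simple(right, [right], dial_the_comb, dial, is_open)
unsure = can_simple(right, [wrong], dial_the_comb, dial, is_open)
```

When the agent cannot rule out a world with a different combination, the description
"the combination of the safe" no longer picks out a single action for him, so he
cannot open the safe by that description even though the action itself would work.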

This finishes most of the very general axioms; the remainder are about specific
actions or predicates. Since all our examples involving both knowledge and action deal
with opening safes, we will look at the axioms for Dial next.

D1a.  R(:Do(a1,:Dial(x1,x2)),w1,w2) ->
      (∃w3(V(w3,:Comb(x2)) = x1) ∧ :Safe(x2) ∧ H(w1,:At(a1,x2)))

D1b.  R(:Do(a1,:Dial(x1,x2)),w1,F1(a1,x1,x2,w1)) <-
      ((V(w3,:Comb(x2)) = x1) ∧ :Safe(x2) ∧ H(w1,:At(a1,x2)))

D1a and D1b describe the circumstances under which it is possible to perform a dialing
action. The thing being dialed must be a combination, the thing it is dialed on must be a
safe, and the agent must be at the same place as the safe. We have split axiom D1 into two
parts so that the existential quantifier could be removed from the left side. Removal of the
quantifier is necessary for the unification-based matching routine to work properly. The
biconditional in D1 had to be broken apart because the quantifier in a formula of the form
(∃x(P) -> Q) is Skolemized differently than the quantifier in (∃x(P) <- Q).

D2.  R(:Do(a1,:Dial(x1,x2)),w1,w2) ->
     ((H(w2,int.p1) <->
       (((int.p1 = :Open(x2)) ∧
         ((V(w1,:Comb(x2)) = x1) ∨ H(w1,:Open(x2)))) ∨
        ((int.p1 ≠ :Open(x2)) ∧ H(w1,int.p1)))) ∧
      (V(w2,int.trm1) = V(w1,int.trm1)))

D2 is significantly more complex than the preceding axioms and deserves special
attention. D2 describes the total physical effects of dialing and incorporates the previous
frame axiom D4. It says that in the situation/world resulting from dialing the combination
x1 on the safe x2, any proposition is true just in case the proposition is that the safe is open,
and either the combination dialed was the combination of the safe or the safe was already
open, or the proposition is something other than that the safe is open and the proposition
was true before the dialing took place. Also, any term refers to the same object in the new
situation as it did in the preceding situation.

D2 is structured so that all assertions and goals "flow" backwards in time from the new
situation to the old situation. This incorporates a solution to the frame problem which has
been advocated by many authors, including Kowalski (1974), Hewitt (1975), and Waldinger
(1975). That method is: To decide whether a proposition is true in the situation resulting
from performing an action, first see whether the action made the proposition true or false
and report success or failure accordingly, and if the action did not affect the proposition, see
whether the proposition was true in the situation prior to the action. If there is a sequence of
situations leading to the situation we are interested in, the procedure is recursive. D2
implements this procedure for physical propositions, because whether the safe is open is the
only physical condition affected by dialing.
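
The backward flow from the new situation to the old one can be sketched as a recursive
procedure. The sketch below is an illustration rather than the rule itself: it assumes
that every action in the history is a dialing action that was possible, represents
propositions as tuples, and reads the combination of each safe from the initial
situation (which is legitimate because, by D2, terms keep their denotations).

```python
def holds_after(prop, history, initial):
    """Decide whether `prop` holds after performing `history` from `initial`.

    history: list of ('dial', combination, safe) actions, oldest first.
    initial: dict with 'facts' (a set of propositions) and 'comb'
             (a dict mapping each safe to its combination).
    """
    if not history:
        return prop in initial["facts"]
    *earlier, (_, dialed, safe) = history
    # The last action makes Open(safe) true if the right combination was dialed.
    if prop == ("open", safe) and initial["comb"][safe] == dialed:
        return True
    # Otherwise the action did not affect prop: look at the preceding situation.
    return holds_after(prop, earlier, initial)


initial = {"facts": {("at", "john", "sf1")}, "comb": {"sf1": "12-34"}}
history = [("dial", "00-00", "sf1"),
           ("dial", "12-34", "sf1"),
           ("dial", "99-99", "sf1")]

open_at_end = holds_after(("open", "sf1"), history, initial)
open_after_first = holds_after(("open", "sf1"), history[:1], initial)
still_at_safe = holds_after(("at", "john", "sf1"), history, initial)
```

The third dialing does not affect whether the safe is open, so the question is passed
back to the situation after the second dialing, where it is settled.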

D2 also takes into account another possibility ignored by most systems. That possibility
is that we may be told something about a situation that implies something about a
preceding situation. For instance, if we are told that the safe is not open after the dialing
action, this implies that the safe was not open before the dialing action either. D2 handles
this inference, since its output functions as a forward-chaining rule which, for any fact
which is asserted about the situation resulting from the action, asserts the information it
provides about the preceding situation.

Another significant fact about D2 is that it is a forward-chaining rather than a
backward-chaining rule. If we are trying to verify the effects of a given action, we clearly
want to confine our attention to assertions about that action. If we turned D2 around and
made it a backward-chaining rule, the test whether the action involved is the one we are
interested in would be the last thing checked. If we had axioms describing many other
actions, the system could do a lot of useless search before finding the action it needed. On
the other hand, if we were doing plan generation, we would be looking primarily for a
specific result, and be willing to take any action that provided it to us, so a
backward-chaining rule would be appropriate. We would probably want to have two logically
equivalent rules, with restrictions on the input variables to distinguish them. If the variable
for the resulting situation were unbound, it would indicate that we were looking for a way
to achieve a goal, and we would use the backward-chaining rule. If the variable for the
resulting situation were bound, it would indicate that we were trying to verify the results of
a particular action, and we would use the forward-chaining rule.

Also note that for D2, we do not use the contrapositive form. The natural
contrapositive would be a backward-chaining rule for proving a goal of the form
¬R(:Do(a1,:Dial(x1,x2)),w1,w2), but the only goals of that form will be attempts to prove that
a dialing action is not possible because its prerequisites are not satisfied. In these cases, D1a
is the only appropriate rule to use. This is another case where the meta-language provides
for more possibilities than the object language requires.

There is one problem with using D2 as a forward-chaining rule, however. Suppose we
want to prove something of the form T(W1,Res(Do(A,Dial(C1,Sf1)),P)). That is, we want to
show that P could be achieved by A doing Dial(C1,Sf1) in W1. This would get translated by
R1 into the meta-language goal of showing that there is some world w2 such that
R(:Do(:A,:Dial(:C1,:Sf1)),W1,w2) is true and T(w2,P); i.e., w2 is a possible outcome of
Do(A,Dial(C1,Sf1)) happening in W1, and P is true in w2. The only rule we have for
attacking a goal of the form R(:Do(:A,:Dial(:C1,:Sf1)),W1,w2) is D1b. If all the prerequisites
are satisfied, this rule will succeed, leaving w2 bound to the Skolem term F1(:A,:C1,:Sf1,W1)
(which we will abbreviate as W2), and we will try to prove the remaining goal T(W2,P).
Typically, showing this will depend on the information contained in D2. The trouble is
that D2 needs the explicit assertion R(:Do(:A,:Dial(:C1,:Sf1)),W1,W2) in order to trigger.
Although we have just proved this formula, the proof procedure we currently have does
not cause it to be asserted. There is a danger here that asserting everything that is proved
may produce an explosion of forward-chaining inferences. Therefore we will make a
special case of formulas of the form R(ev1,w1,w2). Whenever anything matching this
pattern is proved, it will be explicitly asserted in order to give rules like D2 (and, as we
shall see, D3) a chance to fire.

D3.  R(:Do(a1,:Dial(x1,x2)),w1,w2) ->
     (K(a1,w2,w3)/[w2 ≠ w3] <->
      (∃w4(K(a1,w1,w4)/[w1 ≠ w4] ∧
           R(:Do(a1,:Dial(x1,x2)),w4,w3)) ∧
       (H(w2,:Open(x2)) <-> H(w3,:Open(x2)))))

D3 explains how dialing affects the knowledge of the agent. Basically, it says that after
dialing, the agent knows what action he has performed and he knows whether the safe is
open. How this is expressed in terms of possible worlds was thoroughly explained in
chapter 3. D3 is similar in many respects to D2. It is a forward-chaining rule with no
contrapositive interpretation for exactly the same reasons as D2. The consequent of D3
introduces some new concerns. Just as in its purely logical form, the consequent of D3 is a
biconditional. The interpretation of this biconditional, however, is different from either of
the principal procedural interpretations introduced in chapter 6. (P <-> Q) may simply be
regarded as a syntactic abbreviation for simultaneously expressing (P -> Q) and (P <- Q). That
is, only goals and assertions corresponding to P are dealt with; ¬P is ignored. We make this
restriction in D3, since the formula on the left side of <-> is K(a1,w2,w3), a formula whose
negation should never occur.

D3 also contains syntactic restrictions on some of its subformulas. To see why, we need
to look in detail at how D3 works. Suppose W2 is the result of Do(A,Dial(C1,Sf1)) occurring
in W1. D3 will trigger on this assertion, producing the new assertion:

(1)  K(:A,W2,w3)/[W2 ≠ w3] <->
     (∃w4(K(:A,W1,w4)/[W1 ≠ w4] ∧
          R(:Do(:A,:Dial(:C1,:Sf1)),w4,w3)) ∧
      (H(W2,:Open(:Sf1)) <-> H(w3,:Open(:Sf1))))

This says that any world w3 which is compatible with what A knows in the new
situation W2 must be the result of Do(A,Dial(C1,Sf1)) happening in some world which was
possible according to what A knew in the old situation W1, and agree with W2 as to whether
Sf1 is open. The syntactic restriction [W2 ≠ w3] prevents consideration of W2 itself as a
binding for w3. One reason for this restriction is that to allow W2 as a binding for w3
would generate no real information. We already know that W2 is the result of
Do(A,Dial(C1,Sf1)) happening in a world which was possible according to what A knew in
W1, namely W1 itself; and W2 obviously agrees with itself as to whether Sf1 is open.
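
The semantic content of assertion (1) can be sketched as a computation of the worlds
compatible with what A knows after dialing. This is an illustration under simplifying
assumptions (deterministic dialing, worlds as dictionaries, invented function names);
it deliberately ignores the syntactic restrictions, which matter only for controlling
the deduction.

```python
def dial_12_34(world):
    """Result of dialing the combination 12-34 in `world`."""
    return {"comb": world["comb"],
            "open": world["open"] or world["comb"] == "12-34"}


def alternatives_after(k_before, actual_before, act):
    """Worlds compatible with the agent's knowledge after `act`: results of
    `act` in worlds he previously considered possible that agree with the
    actual result on whether the safe is open."""
    actual_after = act(actual_before)
    after = []
    for w4 in k_before:
        w3 = act(w4)
        if w3 is not None and w3["open"] == actual_after["open"]:
            after.append(w3)
    return after


actual = {"comb": "12-34", "open": False}
other = {"comb": "56-78", "open": False}

# Before dialing, A cannot tell which of the two combinations is the right one.
k_after = alternatives_after([actual, other], actual, dial_12_34)
```

Here the safe opens in the actual world, so the alternative in which the combination
was 56-78 is eliminated: by observing the result, A has learned the combination.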

An even more important reason for the restriction [W2 ≠ w3] is to avoid generating an
infinite number of assertions. Suppose we did allow w3 to be bound to W2. K2 would then
apply, generating:

(2)  ∃w4(K(:A,W1,w4)/[W1 ≠ w4] ∧
         R(:Do(:A,:Dial(:C1,:Sf1)),w4,W2)) ∧
     (H(W2,:Open(:Sf1)) <-> H(W2,:Open(:Sf1)))

The last part of this assertion could be deleted as a tautology, but the first part would be
Skolemized and turned into the two assertions:

(3)  K(:A,W1,W4)

(4)  R(:Do(:A,:Dial(:C1,:Sf1)),W4,W2)

Assertion (4), however, would trigger D3 all over again, recursing infinitely. The trouble
is that the Skolem constant W4 really refers to the same world as W1, but our techniques are
not clever enough to catch this. If we had a cleverer form of subsumption, we
could notice that the existentially quantified part of assertion (2) is already known to be true
before Skolemization takes place. Alternatively, we could have an axiom saying that every
world is the successor of exactly one other world with respect to a particular action. This
fact plus assertion (4) would cause us to conclude that W4 equals W1, and W4 would be
replaced by W1 in assertions (3) and (4), which would then be deleted by subsumption.
Either of these methods, however, is much more complicated than using a simple syntactic
restriction.
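
The runaway behavior that the restriction prevents can be imitated with a toy
forward-chaining loop. This sketch is not the deduction system described here; it only
models the one interaction discussed above (D3 firing on an R assertion, K2 proposing
the world itself as a binding, and Skolemization producing a fresh world), and it uses
an artificial limit on firings so that the unrestricted run stops.

```python
from itertools import count


def forward_chain(restricted, limit=20):
    """Return the assertions made and the number of times D3 fired."""
    fresh = count(4)                      # next Skolem world: W4, W5, ...
    asserted = [("R", "W1", "W2")]        # W2 is the result of dialing in W1
    agenda = [("R", "W1", "W2")]
    firings = 0
    while agenda and firings < limit:
        _, w_before, w_after = agenda.pop()
        firings += 1
        # K2 (reflexivity) proposes w_after itself as a binding for w3.
        candidates = [w_after]
        if restricted:
            # The syntactic restriction [w2 != w3] rejects that binding.
            candidates = [w3 for w3 in candidates if w3 != w_after]
        for w3 in candidates:
            w4 = f"W{next(fresh)}"        # Skolem constant for the quantifier
            asserted += [("K", w_before, w4), ("R", w4, w3)]
            agenda.append(("R", w4, w3))  # the new R assertion triggers D3 again
    return asserted, firings


with_restriction = forward_chain(restricted=True)
without_restriction = forward_chain(restricted=False, limit=20)
```

With the restriction, D3 fires once and adds nothing; without it, every firing creates
a new Skolem world and a new R assertion, and the process stops only at the limit.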

The other syntactic restriction in D3 ([w1 ≠ w4] in the third line) is used when the
consequent of D3 is used as a backward-chaining rule. Suppose we know that in W1, A
knows that the safe is open, but does not know that P is true, where P has nothing to do
with whether the safe is open; and we want to show that after performing Dial(C1,Sf1), A
still does not know that P is true. Asserting that A already knows that the safe is open is
the easiest way to ensure that finding out whether the safe is open after dialing will not
indirectly tell A whether P is true. The forward inferences from the premises of the
problem will include meta-language assertions to the effect that there is some possible world,
say W4, which is compatible with what A knows in W1, i.e. K(:A,W1,W4), and in which P is
false, i.e. ¬H(W4,:P), and the safe is open.

In proving that A does not know that P is true after dialing the safe, we would assert
that some world, say W2, is the result of Do(A,Dial(C1,Sf1)) happening in W1, which would
trigger D2 and D3. As in the previous example, D3 would produce assertion (1), above.
Then we would try to prove that there is some world which is compatible with what A
knows in W2 in which P is false, i.e. K(:A,W2,w3) ∧ ¬H(w3,:P). K2 provides one solution to
the first goal in this conjunction. That is, one way to prove that A does not know that P is
true in W2 is to prove that P is, in fact, not true in W2. If this fails, the only other
applicable rule we have is assertion (1). Assertion (1) is now used as a backward-chaining
rule and produces the following conjunctive goal:

(5)  K(:A,W1,w4)/[W1 ≠ w4] ∧
     R(:Do(:A,:Dial(:C1,:Sf1)),w4,w3) ∧
     (H(W2,:Open(:Sf1)) <-> H(w3,:Open(:Sf1)))

Now the syntactic restriction in goal (5) comes into play. If we let w4 be bound to W1,
then the second conjunct would be solved by binding w3 to W2, which would give us the
same case we considered when we applied K2 to our top level goal. So this restriction, like
the others we have seen, eliminates a redundancy in our search space.

If we continue with the deduction, we would eventually try binding w4 to W4. Solving
the second conjunct would produce a binding for w3, say W3, which is the result of
Do(A,Dial(C1,Sf1)) happening in W4. Since the safe was open in both W1 and W4, the safe
would also be open in both W2 and W3, so the third conjunct of goal (5) is also satisfied.
This leaves us with only the second conjunct of the top level goal, ¬H(W3,:P), left to satisfy.
Since W3 is the successor of W4, and ¬H(W4,:P) is true, the part of D2 that says what does
not change will let us prove that ¬H(W3,:P) is true, completing the proof.

There is one more comment to make about D3. Since the output of D3 has one
interpretation as a backward-chaining rule for proving goals of the form K(a1,w1,w2), and
since most of the rules that have K(a1,w1,w2) as an antecedent are forward-chaining rules,
we may have a problem proving goals like K(A,W1,w2) ∧ T(w2,P). A goal like this could
come from trying to show that A doesn't know that ¬P is true in W1. The problem is
that we might use the output of D3, i.e. assertion (1), to solve the first part of the goal, but
K(A,W1,w2) might have to be explicitly asserted to trigger forward-chaining rules to solve
the second part of the goal. This is essentially the same problem as with R(ev1,w1,w2),
which we pointed out in the discussion of D2. As in that case, we will make a practice of
explicitly asserting any formula that matches K(a1,w1,w2) which has just been proved using
a backward-chaining rule.

This  concludes  the  analysis  of  the  most  important  rules  which  we  will  use  in  doing 
deductions  that  involve  both  knowledge  and  action.  The  point  of  going  into  so  much  detail 
about  them  is  to  convey  a  feeling  for  the  kinds  of  considerations  that  go  into  making  a 
procedural  deduction  system  work  efficiently.  It  should  also  be  obvious  by  now  that  no 
uniform  inference  procedure  could  hope  to  do  the  right  thing  in  all  these  special  cases.  If 
there  were  a  good  theory  of  this  sort  of  thing,  it  would  probably  be  possible  to  paint  a  more 
coherent  picture  of  what  is  going  on.  Unfortunately,  such  a  theory  does  not  currently  exist.

7.2  An  Example  of  an  Action  which  Requires  Knowledge 

In  the  rest  of  this  chapter,  we  will  examine  in  detail  algorithmically  generated  proofs  of 
our  three  benchmark  examples  of  reasoning  about  knowledge  and  action  from  chapter  1.
In  the  first  example,  knowledge  is  required  to  achieve  a  goal;  in  the  second,  an  action  is 
used  to  obtain  knowledge,  and  in  the  third,  there  is  a  sequence  of  two  actions,  where  the 
first  action  produces  information  required  by  the  second  action.  The  possible-world 
structures  for  these  proofs  are  the  same  as  for  the  hand  generated  proofs  in  chapter  5.  It 
may  be  of  some  help,  therefore,  to  refer  to  figures  5.2  -  5.5  in  studying  the  examples. 

These  examples  are  long  and  complicated,  so  the  casual  reader  may  prefer  to  skim  them 
or  skip  over  them  entirely.  Some  general  analysis  of  the  examples  is  presented  in  section 
7.5.  The  major  point  made  there  (and  the  thing  to  note  in  the  proofs  themselves)  is  how 
tightly the procedural information we have built into the axioms constrains the search for
proofs.  In  fact,  there  is  almost  no  blind  searching  at  all  and  the  search  space  itself  is  finite. 
The  general  pattern  of  these  proofs  is  that  the  goal  is  transformed  into  an  implication 
which  is  proved  by  asserting  the  antecedent  and  deriving  the  consequent.  The  assertion  of 
the  antecedent  triggers  off  many  forward  deductions  which  describe  the  possible  world 
structure  relevant  to  the  problem,  and  the  consequent  of  the  goal  is  derived  by  doing  simple 
backward-chaining  inferences  in  that  structure.  As  a  result,  although  these  proofs  are  long, 
no  combinatorial  explosion  of  formulas  occurs. 
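
The general pattern just described can be illustrated with a minimal sketch. It is not the deduction system used in this report: it assumes purely propositional facts and rules given as (premises, conclusion) pairs, and the names forward_chain, prove_implication, FACTS and RULES are hypothetical. The backward-chaining phase is reduced to a direct lookup in the structure built by the forward phase.

```python
def forward_chain(facts, rules):
    """Close a set of facts under rules given as (premises, conclusion) pairs."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and all(p in known for p in premises):
                known.add(conclusion)
                changed = True
    return known


def prove_implication(antecedent, consequent, facts, rules):
    """Prove antecedent -> consequent by asserting the antecedent first."""
    # Asserting the antecedent triggers the forward deductions.
    known = forward_chain(set(facts) | {antecedent}, rules)
    # Stand-in for the backward-chaining phase: here it is a direct lookup.
    return consequent in known


# Hypothetical propositional stand-ins for the safe example (assumptions).
FACTS = {"at(john, sf1)", "safe(sf1)"}
RULES = [
    (("k(john, w0, w1)", "at(john, sf1)"), "at(john, sf1) in w1"),
    (("at(john, sf1) in w1", "safe(sf1)"), "dial possible in w1"),
]

print(prove_implication("k(john, w0, w1)", "dial possible in w1", FACTS, RULES))
```

Because every rule can add each conclusion at most once, the forward phase stops after finitely many steps, which is the sense in which no combinatorial explosion occurs in this toy setting.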

The  first  example  is  to  show  that  if  John  is  at  the  same  place  as  a  safe,  and  he  knows 
the  combination  to  the  safe,  then  he  can  open  the  safe  by  dialing  the  combination.  As  we 
saw  in  chapter  5,  this  proof  requires  one  auxiliary  fact  in  addition  to  those  already 
discussed: 


A1.  K(a1,w1,w2)/[w1 ≠ w2] ->
       (H(w2,:At(a1,x1)) ∨ ¬H(w1,:At(a1,x1)))

This axiom says that if a person is at the same place as some object, he knows that he is
at the same place as the object. A1 is stated somewhat differently than in chapter 5. Here
it is treated as a forward-chaining rule triggered by K(a1,w1,w2). We restrict w1 and w2
from having the same binding in order not to allow K2 to trigger the tautologous
conclusion H(w1,:At(a1,x1)) ∨ ¬H(w1,:At(a1,x1)). The consequent is expressed as a disjunction,
which is procedurally interpreted as two backward-chaining rules, because there does not
seem to be any need to use a fact of this form as a forward-chaining rule.

We can now algorithmically generate the following proof (see figure 5.2):

Given: True(Safe(Sf1))
       True(At(John,Sf1))
       True(Exist(?X1,Know(John,Eq(?X1,Comb(Sf1)))))

Prove: True(Can(John,Dial(Comb(Sf1),Sf1),Open(Sf1)))

1.  •True(Safe(Sf1))                                      Given
2.  :Safe(:Sf1)                                           L
3.  •True(At(John,Sf1))                                   Given
4.  H(W0,:At(:John,:Sf1))                                 L
5.  •True(Exist(?X1,Know(John,Eq(Comb(Sf1),?X1))))        Given
6.  K(:John,W0,w1) -> T(w1,Eq(Comb(Sf1),@(:C)))           L,K1
7.  •T(W0,Eq(Comb(Sf1),@(:C)))                            K2
8.  V(W0,:Comb(:Sf1)) = :C                                L

Lines 1 - 8 state the premises of the problem and the forward inferences made from
them. Line 1 says that Sf1 is a safe, and line 3 says that John is at the same place as the
safe. Lines 2 and 4 translate these facts into the meta-language. Line 5 expresses the fact
that John knows the combination to the safe, by saying that there is some entity which John
knows to be identical to the combination of the safe. In translating this statement into the
meta-language, we let :C denote this entity. The meta-language translation of line 5 says
that in every world which is compatible with what John knows in W0, the combination of
the safe is :C (line 6). Since W0 is compatible with what John knows in W0, we conclude
that the combination of the safe in W0 is :C (lines 7 - 8). This is the meta-language
expression of the fact that the entity which John knows to be the combination to the safe is
the combination to the safe.
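
The step from line 6 to lines 7 - 8 can be pictured with a small possible-worlds model. This is only a sketch under assumptions of our own: worlds are dictionary entries, the compatibility relation is a table ACCESS, and the names knows, knows_what, WORLDS, ACCESS and comb are hypothetical. Knowing what the combination is means that the term has the same value in every compatible world; since W0 is compatible with itself, that value is also the actual one.

```python
def knows(access, world, prop):
    """True if prop holds in every world compatible with what the agent knows."""
    return all(prop(w) for w in access[world])


def knows_what(access, world, term):
    """Return the value of term if it agrees across all compatible worlds, else None."""
    values = {term(w) for w in access[world]}
    return values.pop() if len(values) == 1 else None


# Hypothetical worlds (assumption): each records the combination of the safe.
WORLDS = {"W0": {"comb": "C"}, "W1": {"comb": "C"}, "W2": {"comb": "X"}}
# W0 is compatible with itself, as K2 requires, and with W1.
ACCESS = {"W0": {"W0", "W1"}}


def comb(w):
    """Value of the combination in world w."""
    return WORLDS[w]["comb"]


print(knows_what(ACCESS, "W0", comb))
```

If W2 were also compatible with what John knows in W0, knows_what would return None: John would no longer know what the combination is.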


9.  •*True(Can(John,Dial(Comb(Sf1),Sf1),Open(Sf1)))                    Goal
10. *T(W0,Can(John,Dial(Comb(Sf1),Sf1),Open(Sf1)))                     L
11. •*T(W0,Know(John,                                                  C1
         And(Eq(@(D(W0,Dial(Comb(Sf1),Sf1))),Dial(Comb(Sf1),Sf1)),
             Res(Do(@(D(W0,John)),Dial(Comb(Sf1),Sf1)),Open(Sf1)))))
12. •*K(:John,W0,W1) ->                                                K1,L
         T(W1,And(Eq(@(D(W0,Dial(Comb(Sf1),Sf1))),Dial(Comb(Sf1),Sf1)),
                  Res(Do(@(D(W0,John)),Dial(Comb(Sf1),Sf1)),Open(Sf1))))


Lines 9 - 12 state the goal and its meta-language translation. The goal is to show that
John can open the safe by dialing the combination (lines 9 - 10). According to C1, he can
do this if he knows precisely what action dialing the combination of the safe is, and he
knows that dialing the combination of the safe will result in the safe being open (line 11).
This is re-expressed as the goal of showing that in every world which is compatible with
what John knows in W0, dialing the combination of the safe is the same action as it is in
W0, and dialing the combination of the safe will result in the safe being open (line 12).


13. K(:John,W0,W1)                                        Ante
14. K(:John,W1,w3)/[W1 ≠ w3] -> K(:John,W0,w3)            K3
15. H(W1,:At(:John,x1)) ∨ ¬H(W0,:At(:John,x1))            A1
16. •T(W1,Eq(Comb(Sf1),@(:C)))                            6
17. V(W1,:Comb(:Sf1)) = :C                                L

Since line 12 is an implication goal, we assert the antecedent and try to prove the
consequent. We assert K(:John,W0,W1), which triggers K3 (line 14) and A1 (line 15). We
also conclude from line 6 that the combination of the safe in W1 is :C (lines 16 - 17).


18. •*T(W1,And(Eq(@(D(W0,Dial(Comb(Sf1),Sf1))),Dial(Comb(Sf1),Sf1)),    Conse
             Res(Do(@(D(W0,John)),Dial(Comb(Sf1),Sf1)),Open(Sf1))))
19. •*T(W1,Eq(@(D(W0,Dial(Comb(Sf1),Sf1))),Dial(Comb(Sf1),Sf1))) ∧      L
      T(W1,Res(Do(@(D(W0,John)),Dial(Comb(Sf1),Sf1)),Open(Sf1)))
20. •*T(W1,Eq(@(D(W0,Dial(Comb(Sf1),Sf1))),Dial(Comb(Sf1),Sf1)))        Split
21. •*:Dial(V(W0,:Comb(:Sf1)),:Sf1) = :Dial(V(W1,:Comb(:Sf1)),:Sf1)     L
22. •*:Dial(:C,:Sf1) = :Dial(V(W1,:Comb(:Sf1)),:Sf1)                    8
23. •*:Dial(:C,:Sf1) = :Dial(:C,:Sf1)                                   17
24. *T                                                                  Eq

We now try to prove the consequent of line 12, that in W1, dialing the combination of
the safe is the same action as it is in W0, and dialing the combination of the safe will result
in the safe being open (lines 18 - 19). This conjunctive goal is split into two subgoals. First
we try to prove that dialing the combination of the safe in W0 is the same action as dialing
the combination of the safe in W1 (lines 20 - 21). Since the combination of the safe is :C in
both W0 and W1, this goal is transformed into the goal of showing that the action of dialing
:C on :Sf1 is identical to itself (lines 22 - 23). This simplifies to proving T (line 24), so this
branch of the split succeeds.

25. •*T(W1,Res(Do(@(D(W0,John)),Dial(Comb(Sf1),Sf1)),Open(Sf1)))        Split
26. •*R(:Do(:John,:Dial(V(W1,:Comb(:Sf1)),:Sf1)),W1,w2) ∧               R1,L
      T(w2,Open(Sf1))
27. •*R(:Do(:John,:Dial(V(W1,:Comb(:Sf1)),:Sf1)),W1,w2)                 Split
28. *R(:Do(:John,:Dial(:C,:Sf1)),W1,w2)                                 17
29. •*(V(w3,:Comb(:Sf1)) = :C) ∧                                        D1b
      :Safe(:Sf1) ∧ H(W1,:At(:John,:Sf1))
30. *V(w3,:Comb(:Sf1)) = :C                                             Split
31. •*:C = :C                                                           8
32. *T                                                                  Eq
33. *:Safe(:Sf1)                                                        Split
34. *T                                                                  2
35. *H(W1,:At(:John,:Sf1))                                              Split
36. *H(W0,:At(:John,:Sf1))                                              15
37. *T                                                                  4


The other branch of the split is to show that dialing the combination of the safe in W1
will result in the safe being open (line 25). This reduces to showing that it is possible for
John to dial the combination of the safe in W1, and that in the resulting situation the safe
is open (line 26). We split this goal, and try first to show that it is possible for John to dial
the combination of the safe in W1 (line 27). Since the combination of the safe in W1 is :C,
this is transformed into showing that it is possible for John to dial :C on Sf1 in W1 (line 28).
According to D1b, this can be done if :C is a possible combination of Sf1, Sf1 is a safe, and
John is at the same place as Sf1 in W1 (line 29). This goal splits three ways. The first
subgoal is satisfied by the fact that :C is the combination of Sf1 in W0 (lines 30 - 32), and
the second is satisfied by the fact that Sf1 is asserted to be a safe (lines 33 - 34). The third
subgoal is that John is at the same place as the safe in W1 (line 35). According to line 15,
this is true if John is at the same place as the safe in W0 (line 36). But this is one of the
premises of the problem, so we have solved this subgoal (line 37), and also the conjunctive
goal on line 29, and the goal on line 28.


38. R(:Do(:John,:Dial(:C,:Sf1)),W1,W2)                    Solved
39. V(W3,:Comb(:Sf1)) = :C                                D1a
40. •:Safe(:Sf1)                                          D1a
41. •H(W1,:At(:John,:Sf1))                                D1a
42. H(W2,int.p1) <->                                      D2
      (((int.p1 = :Open(:Sf1)) ∧
        ((V(W1,:Comb(:Sf1)) = :C) ∨ H(W1,:Open(:Sf1)))) ∨
       ((int.p1 ≠ :Open(:Sf1)) ∧ H(W1,int.p1)))
43. V(W2,int.trm1) = V(W1,int.trm1)                       D2
44. K(:John,W2,w3)/[W2 ≠ w3] <->                          D3
      (∃w4(K(:John,W1,w4)/[W1 ≠ w4] ∧
           R(:Do(:John,:Dial(:C,:Sf1)),w4,w3)) ∧
       (H(W2,:Open(:Sf1)) <-> H(w3,:Open(:Sf1))))

Since line 28 is a solved goal which matches R(ev1,w1,w2), we now turn the solution to
line 28 into an assertion. Line 28 was solved by binding w2 to F1(:John,:C,:Sf1,W1), but for
simplicity we abbreviate this as W2 (line 38). This assertion triggers several forward-chaining
rules. D1a produces assertions that :C is a possible combination of Sf1, Sf1 is a
safe, and John is at the same place as Sf1 in W1 (lines 39 - 41). Basically, all this
information is redundant. If we made the algorithm a bit more clever, it might notice that
D1a and D1b together form a biconditional, and since we just proved line 38 using D1, we
already know all the information that it contains, and it is unnecessary to trigger D1 again
as a forward-chaining rule. Line 38 also triggers D2, which produces assertions describing
the physical effects of dialing :C on Sf1 (lines 42 - 43), and D3, which produces an assertion
describing the effects of the action on John's knowledge (line 44).
45. •*T(W2,Open(Sf1))                                     Split
46. •*H(W2,:Open(:Sf1))                                   L
47. •*(:Open(:Sf1) = :Open(:Sf1)) ∧                       42
      ((V(W1,:Comb(:Sf1)) = :C) ∨ H(W1,:Open(:Sf1)))
48. •*(:Open(:Sf1) = :Open(:Sf1))                         Split
49. *T                                                    Eq
50. •*V(W1,:Comb(:Sf1)) = :C                              Split
51. •*:C = :C                                             17
52. *T                                                    Eq


Now we try to solve the second branch of the split of line 26. Taking the binding from
the solution to the first branch, we try to show that the safe is open in W2 (lines 45 - 46).
This is a question about the physical effects of dialing :C on Sf1, so line 42 is used (line 47).
Since we are asking about whether the safe is open (lines 48 - 49), we can solve our goal by
showing that :C is the combination to the safe in W1 (line 50). But we know this is true, so
the current subgoal is satisfied (lines 51 - 52). Since this is the last branch of the last split,
the entire proof is complete.
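
The physical effect of dialing that lines 42 - 43 and 45 - 52 rely on can be sketched as a function from a situation to its successor. The representation is an assumption made for illustration only: a situation is a dictionary of facts, and the names dial and W1 are hypothetical. After the action the safe is open exactly if the dialed value is its combination or it was already open; every other fact is carried over unchanged, which is the frame part of D2.

```python
def dial(world, dialed, safe="Sf1"):
    """Return the situation that results from dialing `dialed` on `safe`."""
    result = dict(world)  # frame assumption: all other facts are unchanged
    opened = world["comb"][safe] == dialed or world["open"][safe]
    # Build a fresh "open" table so the original situation is not modified.
    result["open"] = {**world["open"], safe: opened}
    return result


# Hypothetical situation W1 (assumption): the combination of Sf1 is "C".
W1 = {
    "comb": {"Sf1": "C"},
    "open": {"Sf1": False},
    "at": {"John": "Sf1"},
}

W2 = dial(W1, "C")
print(W2["open"]["Sf1"], W2["at"] == W1["at"])
```

Dialing a value other than "C" in W1 leaves the safe closed, while dialing anything on a safe that is already open leaves it open.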


7.3  An  Example  of  an  Action  which  Produces  Knowledge 

In this example, we assume that C1 is the combination of Sf1 and that John knows that
Sf1 is not open. We show how to algorithmically generate a proof that if John tries to open
Sf1 by dialing C1, he will know that C1 is the combination of Sf1. This proof involves the
facts that after dialing C1 John knows that he tried to open the safe, he knows whether the
safe is open, and he understands how the safe being open depends on whether the
combination he dialed is the combination of the safe. (See figure 5.3.)

Given: True(Know(John,Not(Open(Sf1))))
       True(Eq(Comb(Sf1),C1))

Prove: True(Res(Do(John,Dial(C1,Sf1)),Know(John,Eq(Comb(Sf1),C1))))
1.  •True(Know(John,Not(Open(Sf1))))                      Given
2.  K(:John,W0,w1) -> T(w1,Not(Open(Sf1)))                L,K1
3.  •T(W0,Not(Open(Sf1)))                                 K2
4.  ¬H(W0,:Open(:Sf1))                                    L
5.  •True(Eq(Comb(Sf1),C1))                               Given
6.  V(W0,:Comb(:Sf1)) = :C1                               L

Line 1 is the premise that John knows that the safe is not open, and line 2 is its meta-language
translation. Since John knows that the safe is not open, we can conclude that the
safe is not open (lines 3 - 4). Line 5 is the premise that C1 is the combination of the safe,
and line 6 is its meta-language translation.

7.  •*True(Res(Do(John,Dial(C1,Sf1)),Know(John,Eq(Comb(Sf1),C1))))    Goal
8.  •*R(:Do(:John,:Dial(:C1,:Sf1)),W0,W1) ->                          L,R6
      T(W1,Know(John,Eq(Comb(Sf1),C1)))
9.  R(:Do(:John,:Dial(:C1,:Sf1)),W0,W1)                               Ante
10. V(W3,:Comb(:Sf1)) = :C1                                           D1a
11. :Safe(:Sf1)                                                       D1a
12. H(W0,:At(:John,:Sf1))                                             D1a
13. H(W1,int.p1) <->                                                  D2
      (((int.p1 = :Open(:Sf1)) ∧
        ((V(W0,:Comb(:Sf1)) = :C1) ∨ H(W0,:Open(:Sf1)))) ∨
       ((int.p1 ≠ :Open(:Sf1)) ∧ H(W0,int.p1)))
14. V(W1,int.trm1) = V(W0,int.trm1)                                   D2
15. K(:John,W1,w3)/[W1 ≠ w3] <->                                      D3
      (∃w4(K(:John,W0,w4)/[W0 ≠ w4] ∧
           R(:Do(:John,:Dial(:C1,:Sf1)),w4,w3)) ∧
       (H(W1,:Open(:Sf1)) <-> H(w3,:Open(:Sf1))))

Line 7 is the goal of showing that if John dials C1 on Sf1 he will find out that C1 is the
combination of Sf1. R6 transforms this into the goal of showing that if W1 is the result of
John dialing C1 on Sf1 in W0, then in W1 John knows that C1 is the combination of Sf1
(line 8). Since this is an implication, we assert the antecedent, and try to prove the
consequent. Asserting that W1 is the result of John dialing C1 on Sf1 in W0 (line 9) triggers
several forward-chaining rules. D1a produces assertions that C1 is a possible combination
of Sf1, that Sf1 is a safe, and that John is at the same place as Sf1 (lines 10 - 12). D2
produces assertions specifying the physical effects of John dialing C1 on Sf1 (lines 13 - 14),
and D3 produces a specification of the effects of the action on John's knowledge (line 15).


16. •*T(W1,Know(John,Eq(Comb(Sf1),C1)))                               Conse
17. •*K(:John,W1,W2) -> T(W2,Eq(Comb(Sf1),C1))                        L,K1
18. K(:John,W1,W2)                                                    Ante
19. K(:John,W2,w3)/[W2 ≠ w3] -> K(:John,W1,w3)                        K3
20. •H(W2,:At(:John,x1)) ∨ ¬H(W1,:At(:John,x1))                       A1
21. •H(W2,:At(:John,x1)) ∨                                            13
      (((:At(:John,x1) ≠ :Open(:Sf1)) ∨
        ((V(W0,:Comb(:Sf1)) ≠ :C1) ∧ ¬H(W0,:Open(:Sf1)))) ∧
       ((:At(:John,x1) = :Open(:Sf1)) ∨ ¬H(W0,int.p1)))
22. H(W2,:At(:John,x1)) ∨ ¬H(W0,:At(:John,x1))                        Eq


Now we try to prove the consequent of line 8, that in W1 John knows that C1 is the
combination of Sf1 (line 16). This is transformed into the goal of showing that in every
world which is compatible with what John knows in W1, C1 is the combination of Sf1 (line
17). This is an implication, so we assert the antecedent, letting W2 be a typical world which
is possible according to what John knows in W1 (line 18). This triggers K3 (line 19) and
A1 (line 20). The result of A1 is the assertion that either John is at the same place as the
safe in W2, or he is not at the same place as the safe in W1. Since whether John is at the
same place as the safe in W1 depends on the physical effects of John dialing C1 on Sf1 in
W0, line 13 applies to this assertion. The occurrence of ¬H(W1,:At(:John,:Sf1)) in line 20 is
replaced by the corresponding instance of the right side of line 13 (line 21), but since John
being at the same place as the safe is unaffected by this action, this formula simplifies to the
assertion that either John is at the same place as the safe in W2 or he is not in the same
place as the safe in W0.


23. K(:John,W0,W4)                                        15
24. K(:John,W4,w3)/[W4 ≠ w3] -> K(:John,W0,w3)            K3
25. H(W4,:At(:John,x1)) ∨ ¬H(W0,:At(:John,x1))            A1
26. •T(W4,Not(Open(Sf1)))                                 2
27. ¬H(W4,:Open(:Sf1))                                    L
28. R(:Do(:John,:Dial(:C1,:Sf1)),W4,W2)                   15
29. V(W5,:Comb(:Sf1)) = :C1                               D1a
30. :Safe(:Sf1)                                           D1a
31. H(W4,:At(:John,:Sf1))                                 D1a
32. H(W2,int.p1) <->                                      D2
      (((int.p1 = :Open(:Sf1)) ∧
        ((V(W4,:Comb(:Sf1)) = :C1) ∨ H(W4,:Open(:Sf1)))) ∨
       ((int.p1 ≠ :Open(:Sf1)) ∧ H(W4,int.p1)))
33. •(((:At(:John,:Sf1) = :Open(:Sf1)) ∧                  22
        ((V(W4,:Comb(:Sf1)) = :C1) ∨ H(W4,:Open(:Sf1)))) ∨
       ((:At(:John,:Sf1) ≠ :Open(:Sf1)) ∧ H(W4,int.p1))) ∨
      ¬H(W0,:At(:John,x1))
34. •H(W4,:At(:John,x1)) ∨ ¬H(W0,:At(:John,x1))           Eq
35. V(W2,int.trm1) = V(W4,int.trm1)                       D2
36. K(:John,W2,w3)/[W2 ≠ w3] <->                          D3
      (∃w4(K(:John,W4,w4)/[W4 ≠ w4] ∧
           R(:Do(:John,:Dial(:C1,:Sf1)),w4,w3)) ∧
       (H(W2,:Open(:Sf1)) <-> H(w3,:Open(:Sf1))))

The assertion on line 18 that W2 is compatible with everything John knows in W1 also
triggers line 15 as a forward-chaining rule. This results in assertions to the effect that there
is some world, say W4, such that W4 is compatible with everything that John knows in W0
(line 23), and W2 is the result of John dialing C1 on Sf1 in W4 (line 28), and that W2 agrees
with W1 as to whether Sf1 is open (line 37). The assertion that W4 is compatible with what
John knows in W0 triggers K3 (line 24) and A1 (line 25). We also conclude that the safe is
not open in W4 (lines 26 - 27), since this is something John knows in W0.

The assertion that W2 is the result of Do(John,Dial(C1,Sf1)) happening in W4 triggers D1a
to assert that C1 is a possible combination of Sf1 (line 29), that Sf1 is a safe (line 30), and
that John is at the same place as the safe in W4 (line 31). D2 triggers, producing an
assertion which describes how the physical conditions in W2 depend on its being the result
of Do(John,Dial(C1,Sf1)) happening in W4 (line 32). Since line 22 involves a physical
condition in W2, this assertion is immediately applied as a substitution rule. The instance
of H(W2,:At(:John,:Sf1)) in line 22 is replaced by the appropriate instance of the right side
of line 32. Since the proposition in question has nothing to do with the safe being open,
this expression simplifies to H(W4,:At(:John,:Sf1)), leaving the whole expression as it is on
line 34. Since this repeats line 25, it is deleted.

Some comment should be made on this last set of steps. We effectively had one
assertion that John knows where he is before the dialing action, and another assertion that he
knows where he is after the dialing action. But since the dialing action does not affect
where  John  is,  and  he  knows  this,  one  of  these  two  assertions  is  redundant;  we  could  deduce 
either  of  them  given  the  other.  By  using  the  frame  axiom  for  dialing,  we  transformed  one 
of  these  assertions  into  the  other,  enabling  us  to  recognize  the  redundancy  and  eliminate  it. 

Line 35 is the rest of the frame axiom for dialing, noting that dialing does not change
the reference of any term expressions. The last rule triggered by the assertion that W2 is
the result of the dialing happening in W4 is D3, which produces a description of the effects of
the action on John's knowledge in W2 (line 36).

37. •H(W1,:Open(:Sf1)) <-> H(W2,:Open(:Sf1))                          15
38. •(((:Open(:Sf1) = :Open(:Sf1)) ∧                                  13
        ((V(W0,:Comb(:Sf1)) = :C1) ∨ H(W0,:Open(:Sf1)))) ∨
       ((:Open(:Sf1) ≠ :Open(:Sf1)) ∧ H(W0,int.p1))) <->
      H(W2,:Open(:Sf1))
39. •((V(W0,:Comb(:Sf1)) = :C1) ∨ H(W0,:Open(:Sf1))) <->              Eq
      H(W2,:Open(:Sf1))
40. •((:C1 = :C1) ∨ H(W0,:Open(:Sf1))) <->                            6
      H(W2,:Open(:Sf1))
41. •H(W2,:Open(:Sf1))                                                Eq
42. •((:Open(:Sf1) = :Open(:Sf1)) ∧                                   32
       ((V(W4,:Comb(:Sf1)) = :C1) ∨ H(W4,:Open(:Sf1)))) ∨
      ((:Open(:Sf1) ≠ :Open(:Sf1)) ∧ H(W4,int.p1))
43. (V(W4,:Comb(:Sf1)) = :C1) ∨ H(W4,:Open(:Sf1))                     Eq


Now we pop back up and resume considering the consequences of the assertion on line
18 that W2 is compatible with what John knows in W1. The last inference which is drawn
from this assumption and line 15 is that the safe is open in W2 if and only if it is open in
W1 (line 37). Both sides of this assertion refer to physical conditions in situations to which
a frame assertion applies, line 13 for W1, and line 32 for W2. We first replace
H(W1,:Open(:Sf1)) in line 37 with the matching instance of the right side of line 13 (line 38).
This expression simplifies to (V(W0,:Comb(:Sf1)) = :C1) ∨ H(W0,:Open(:Sf1)) (line 39). Since
we know that C1 is the combination of Sf1 in W0, the left side of line 39 simplifies to T.
Since this is a biconditional, we conclude that the right side is also true (lines 40 - 41),
leaving us with the assertion that the safe is open in W2. We now apply the frame assertion
for W2 to line 41, and the resulting expression simplifies to the assertion that either C1 is
the combination of the safe in W4 or the safe is open in W4 (lines 42 - 43). Since W2 is the
result of Do(John,Dial(C1,Sf1)) happening in W4, these are the only two alternatives that could
lead to the safe being open in W2.


44. •*T(W2,Eq(Comb(Sf1),C1))                              Conse
45. •*V(W2,:Comb(:Sf1)) = :C1                             L
46. *V(W4,:Comb(:Sf1)) = :C1                              35
47. *¬H(W4,:Open(:Sf1))                                   43
48. *T                                                    27

Finally we come to trying to prove the consequent of line 17, given all the conclusions
which we have drawn from the antecedent. The goal is to show that C1 is the combination
of Sf1 in W2 (lines 44 - 45). Since dialing does not change the combination of the safe, this
is equivalent to showing that C1 is the combination of Sf1 in W4 (line 46). According to
line 43, we can show this if we can show that the safe is not open in W4 (line 47). But we
already know that this is true from line 27, so the proof is complete (line 48).
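
The way the dialing action produces knowledge can be illustrated by a small model of the idea behind D3. This is a sketch under our own assumptions, not the axiomatization itself: a world is reduced to the pair (combination, open or not), and the names World, dial, compatible_after_dial and known_comb are hypothetical. The worlds compatible with what John knows after dialing are the results of dialing in the worlds that were compatible before, restricted to those that agree with the actual result on whether the safe is open.

```python
from typing import NamedTuple


class World(NamedTuple):
    comb: str       # the combination of the safe in this world
    is_open: bool   # whether the safe is open in this world


def dial(world, dialed):
    """Physical effect of dialing: the safe opens if `dialed` is its combination."""
    return World(world.comb, world.is_open or world.comb == dialed)


def compatible_after_dial(actual, compatible, dialed):
    """Worlds compatible with the agent's knowledge after the dialing action."""
    actual_result = dial(actual, dialed)
    return {
        dial(w, dialed)
        for w in compatible
        if dial(w, dialed).is_open == actual_result.is_open
    }


def known_comb(compatible):
    """The combination, if it is the same in every compatible world, else None."""
    combs = {w.comb for w in compatible}
    return combs.pop() if len(combs) == 1 else None


# Hypothetical initial state (assumption): John knows the safe is closed,
# but for all he knows the combination could be C1, C2 or C3.
ACTUAL = World("C1", False)
BEFORE = {World("C1", False), World("C2", False), World("C3", False)}

AFTER = compatible_after_dial(ACTUAL, BEFORE, "C1")
print(known_comb(BEFORE), known_comb(AFTER))
```

In this model, dialing a wrong value such as C2 only rules out the world in which C2 is the combination, so John still does not know what the combination is afterwards.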

7.4 An Example of Acquiring Knowledge Required for an Action

Our  final  example  is  to  produce  a  proof  that  if  John  has  a  piece  of  paper  with  the 
combination  of  the  safe  written  on  it,  if  he  can  read,  and  if  he  is  at  the  safe,  then  he  can 
open  the  safe  by  reading  the  piece  of  paper  and  dialing  the  combination.  This  requires  the 
introduction of a new action Read, and the associated predicates Reads and Info.

INF1.  T(w1,Info(trm.x1,exp1)) <-> (V(w1,:Info(D(w1,trm.x1))) = exp1)

RDS1.  K(a1,w1,w2)/[w1 ≠ w2] ->
         (H(w2,:Reads(a1)) ∨ ¬H(w1,:Reads(a1)))

An object-language formula of the form Info(X,Exp) means that the object X has the
information Exp written on it, where Exp is some well-formed object-language expression.
INF1 translates this into the meta-language. An object-language formula of the
form Reads(A) means that A can read. We will treat Reads as though it were a simple
physical predicate, so its translation into the meta-language will be handled by L9a. RDS1
says that anyone who can read knows that he can read. This rule is expressed in exactly
the same form as A1.

RD1a.  R(:Do(a1,:Read(x1)),w1,w2) ->
         (H(w1,:Reads(a1)) ∧ H(w1,:At(a1,x1)))

RD1b.  R(:Do(a1,:Read(x1)),w1,F2(a1,x1,w1)) <-
         (H(w1,:Reads(a1)) ∧ H(w1,:At(a1,x1)))

RD2.   R(:Do(a1,:Read(x1)),w1,w2) ->
         (K(a1,w2,w3)/[w2 ≠ w3] <->
          ∃w4(K(a1,w1,w4)/[w1 ≠ w4] ∧
              (V(w4,:Info(x1)) = V(w1,:Info(x1))) ∧
              R(:Do(a1,:Read(x1)),w4,w3)))

RD3.   R(:Do(a1,:Read(x1)),w1,w2) ->
         ((V(w2,int.trm1) = V(w1,int.trm1)) ∧
          (H(w2,int.p1) <-> H(w1,int.p1)))

Read(X) is the object-language representation of the action of reading the information
written on the object X. RD1a and RD1b specify the prerequisites for reading something;
the agent has to be able to read, and he has to be at the same place as the thing he is going
to read. These two rules are expressed in the same form as D1a and D1b. RD2 gives the
effects of reading on the knowledge of the agent. It says in the usual form that he knows
what was written on the object, and he knows that he has just read what was written on the
object. RD3 describes the physical effects of reading. Since there really aren't any, it just
says that all physical conditions are the same as they were before the action took place.
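
The effect of reading on knowledge described by RD2 can be sketched in the same style. The representation is again an assumption made only for illustration: a world is a dictionary giving the combination of the safe and the information written on the paper, and the names compatible_after_read, known_value, ACTUAL and BEFORE are hypothetical. Since RD3 says reading changes no physical conditions, the worlds themselves are left untouched; reading only discards the previously compatible worlds in which the paper says something different from what it actually says.

```python
def compatible_after_read(actual, compatible):
    """Keep the worlds that agree with the actual one on what the paper says."""
    return [w for w in compatible if w["info"] == actual["info"]]


def known_value(compatible, term):
    """The value of `term`, if it is the same in every compatible world, else None."""
    values = {w[term] for w in compatible}
    return values.pop() if len(values) == 1 else None


# Hypothetical initial state (assumption): in every world compatible with what
# John knows, the paper carries the combination, but he has not read it yet.
ACTUAL = {"comb": "C0", "info": "C0"}
BEFORE = [
    {"comb": "C0", "info": "C0"},
    {"comb": "C1", "info": "C1"},
    {"comb": "C2", "info": "C2"},
]

AFTER = compatible_after_read(ACTUAL, BEFORE)
print(known_value(BEFORE, "comb"), known_value(AFTER, "comb"))
```

Before reading, John knows only that the paper and the safe agree; after reading, the combination has a single value across all compatible worlds, which is the knowledge the subsequent dialing action requires.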

The  proof  is  as  follows  (see  figure  5.5): 

Given: True(Safe(Sf1))
       True(At(John,Sf1))
       True(At(John,Ppr1))
       True(Reads(John))
       True(Know(John,Exist(?X1,And(Eq(Comb(Sf1),?X1),Info(Ppr1,?X1)))))

Prove: True(Can(John,(Read(Ppr1); Dial(Comb(Sf1),Sf1)),Open(Sf1)))


1.  •True(Safe(Sf1))                                                             Given
2.  :Safe(:Sf1)                                                                  L
3.  •True(At(John,Sf1))                                                          Given
4.  H(W0,:At(:John,:Sf1))                                                        L
5.  •True(At(John,Ppr1))                                                         Given
6.  H(W0,:At(:John,:Ppr1))                                                       L
7.  •True(Reads(John))                                                           Given
8.  H(W0,:Reads(:John))                                                          L
9.  •True(Know(John,Exist(?X1,And(Eq(Comb(Sf1),?X1),Info(Ppr1,?X1)))))           Given
10. K(:John,W0,w1) -> T(w1,Exist(?X1,And(Eq(Comb(Sf1),?X1),Info(Ppr1,?X1))))     L,K1
11. •T(W0,Exist(?X1,And(Eq(Comb(Sf1),?X1),Info(Ppr1,?X1))))                      K2
12. V(W0,:Comb(:Sf1)) = :C0                                                      L
13. V(W0,:Info(:Ppr1)) = @(:C0)                                                  L


Lines 1 - 8 give the first four premises and their meta-language translations. These are
the assertions that Sf1 is a safe, that John is at the same place as the safe, that John is at
the same place as the piece of paper Ppr1, and that John can read. Line 9 says that John
knows that there is some entity which is the combination of the safe, and which is the
information written on the piece of paper; i.e., John knows that the combination of the safe
is the only thing written on the piece of paper. K1 transforms this into the assertion that in
every world which is compatible with what John knows in the actual world, the
combination of the safe is the only thing written on the piece of paper (line 10). K2 triggers
this rule to assert that the combination of the safe, represented by :C0, is actually the only
thing written on the piece of paper (lines 11 - 13).


14. •*True(Can(John,(Read(Ppr1); Dial(Comb(Sf1),Sf1)),Open(Sf1)))               Goal
15. *T(W0,Can(John,Read(Ppr1),                                                  L,C2
              Can(@(D(W0,John)),Dial(Comb(Sf1),Sf1),Open(Sf1))))
16. •*T(W0,Know(John,And(Eq(@(D(W0,Read(Ppr1))),Read(Ppr1)),                    C1
                         Res(Do(@(D(W0,John)),Read(Ppr1)),
                             Can(@(D(W0,John)),Dial(Comb(Sf1),Sf1),Open(Sf1))))))
17. •*K(:John,W0,W1) ->                                                         K1
      T(W1,And(Eq(@(D(W0,Read(Ppr1))),Read(Ppr1)),
               Res(Do(@(D(W0,John)),Read(Ppr1)),
                   Can(@(D(W0,John)),Dial(Comb(Sf1),Sf1),Open(Sf1)))))


Line 14 states the goal that John can open the safe by reading the piece of paper and
dialing the combination of the safe. C2 expands this into the goal that by reading the piece
of paper John can bring it about that he can open the safe by dialing the combination (line
15). C1 expands this into the goal that John knows what action reading the piece of paper
is, and he knows that reading the piece of paper would bring it about that he can open the
safe by dialing the combination (line 16). Finally, K1 transforms this into the goal that if
W1 is a typical world which is possible according to what John knows in W0, then it is true
in W1 that reading the piece of paper is the same action that it is in the actual world, and
that John reading the piece of paper would bring it about that he can open the safe by
dialing the combination (line 17).


18. K(:John,W0,W1)                                                    Ante
19. K(:John,W1,w3)/[W1 ≠ w3] -> K(:John,W0,w3)                        K3
20. H(W1,:At(:John,x1)) ∨ ¬H(W0,:At(:John,x1))                        A1
21. H(W1,:Reads(:John)) ∨ ¬H(W0,:Reads(:John))                        RDS1
22. •T(W1,Exist(?X1,And(Eq(Comb(Sf1),?X1),Info(Ppr1,?X1))))           10
23. V(W1,:Comb(:Sf1)) = :C1                                           L
24. V(W1,:Info(:Ppr1)) = @(:C1)                                       L


To prove the implication on line 17, we first assert the antecedent, that W1 is compatible
with what John knows in W0 (line 18). This triggers K3, A1, and RDS1 (lines 19 - 21). It
also triggers line 10 to assert that the combination of the safe in W1, represented by :C1, is
the only thing written on the piece of paper in W1 (lines 22 - 24).


25. •*T(W1,And(Eq(@(D(W0,Read(Ppr1))),Read(Ppr1)),                              Conse
               Res(Do(@(D(W0,John)),Read(Ppr1)),
                   Can(@(D(W0,John)),Dial(Comb(Sf1),Sf1),Open(Sf1)))))
26. •*T(W1,Eq(@(D(W0,Read(Ppr1))),Read(Ppr1))) ∧                                L
      T(W1,Res(Do(@(D(W0,John)),Read(Ppr1)),
               Can(@(D(W0,John)),Dial(Comb(Sf1),Sf1),Open(Sf1))))
27. •*T(W1,Eq(@(D(W0,Read(Ppr1))),Read(Ppr1)))                                  Split
28. •*:Read(:Ppr1) = :Read(:Ppr1)                                               L
29. *T                                                                          Eq


Now we try to prove the consequent of line 17, the goal that reading the piece of paper
in W1 is the same action that it is in the actual world and that John reading the piece of
paper in W1 would bring it about that John can open the safe by dialing the combination
(lines 25 - 26). Since this is a conjunctive goal, we split it into two subgoals. The first
subgoal, showing that reading the piece of paper in W1 is the same action as it is in the
actual world, is easily solved if we treat Ppr1 as a rigid designator for the piece of paper.
This amounts to assuming that John knows what object Ppr1 is. If he knows what object
he is supposed to read, then he certainly knows what action reading that object is (lines 27 -
29).


30. •*T(W1,Res(Do(@(D(W0,John)),Read(Ppr1)),                        Split
               Can(@(D(W0,John)),Dial(Comb(Sf1),Sf1),Open(Sf1))))

31. •*R(Do(:John,Read(Ppr1)),W1,w2) ∧                                R1,L
      T(w2,Can(@(D(W0,John)),Dial(Comb(Sf1),Sf1),Open(Sf1)))

32. *R(Do(:John,Read(Ppr1)),W1,w2)                                   Split

33. *H(W1,:Reads(:John)) ∧ H(W1,:At(:John,:Ppr1))                    RD1b

34. *H(W1,:Reads(:John))                                             Split

35. *H(W0,:Reads(:John))                                             21

36. *T                                                               8

37. *H(W1,:At(:John,:Ppr1))                                          Split

38. *H(W0,:At(:John,:Ppr1))                                          20

39. *T                                                               6


Now we try to prove the second subgoal which results from splitting line 26, the goal
that John reading the piece of paper in W1 would bring it about that John can open the
safe by dialing the combination (line 30). R1 transforms this into the goal that there is
some world which is the result of John reading the piece of paper in W1 and in which John
can open the safe by dialing the combination (line 31). This goal is split, with the first
subgoal being to find a world which is the result of John reading the piece of paper in W1
(line 32). RD1b says there is such a world if John can read, and if John is at the same
place as the piece of paper (line 33). This goal is also split, and we first try to show that
John can read in W1 (line 34). Since if John can read, he knows he can read, this goal is
satisfied if John can read in the actual world (line 35). This was one of the premises of the
problem, so this branch of the split succeeds (line 36). The other branch requires us to
show that John is at the same place as the piece of paper in W1 (line 37). We are assuming
that if John is at the same place as the piece of paper, he knows he is at the same place as the
piece of paper, so this goal is satisfied if John is at the same place as the piece of paper in
the actual world (line 38). This is also one of the premises, so this branch of the split is also
satisfied (line 39). This proves line 33, and hence produces a solution to line 32, with w2
bound to F2(:John,:Ppr1,W1). To keep formulas short in the rest of the proof, we will
abbreviate this as W2.


40. R(Do(:John,Read(Ppr1)),W1,W2)                                    Solved

41. H(W1,:Reads(:John))                                              RD1a

42. H(W1,:At(:John,:Ppr1))                                           RD1a

43. K(:John,W2,w3)/[W2 ≠ w3] <->                                     RD2
      ∃w4(K(:John,W1,w4)/[W1 ≠ w4] ∧
          (V(w4,:Info(:Ppr1)) = V(W1,:Info(:Ppr1))) ∧
          R(:Do(:John,:Read(:Ppr1)),w4,w3))

44. •V(W2,int.trm1) = V(W1,int.trm1)                                 RD3

45. V(W2,:Comb(:Sf1)) = :C1                                          23

46. •V(W2,int.trm1) = V(W1,int.trm1)/[int.trm1 ≠ :Comb(:Sf1)]        23

47. V(W2,:Info(:Ppr1)) = @(:C1)                                      24

48. V(W2,int.trm1) = V(W1,int.trm1)/                                 24
      [(int.trm1 ≠ :Comb(:Sf1)) ∧ (int.trm1 ≠ :Info(:Ppr1))]

49. H(W2,int.p1) <-> H(W1,int.p1)                                    RD3


Since we have just solved a goal which matches R(ev1,W1,w2), we assert the solution
(line 40). RD1a causes us to assert that John can read in W1 and that John is at the same
place as the piece of paper in W1 (lines 41 - 42). RD2 triggers, producing a description of
the effect of reading the piece of paper on John's knowledge (line 43). RD3, the frame
axiom for reading, also triggers, producing two assertions. The first assertion is that all
terms in W2 refer to the same objects as they do in W1 (line 44). However, on lines 23 and
24, we have assertions specifying the referents of two particular terms in W1, the
combination of the safe, and the information written on Ppr1. In section 6.4 we described a
rule for equality substitutions where the term being substituted for is more general than the
assertion specifying the substitution. Applying this rule in the present case produces an
assertion that all terms in W2 refer to the same objects as they do in W1, but with syntactic
restrictions preventing the assertion from being applied to the terms for the combination of
the safe and the information written on the piece of paper. We also get specific assertions
saying that the information written on the piece of paper in W2 is the combination :C1
(actually, its standard name), and that the combination of the safe in W2 is also :C1 (lines 45
- 48). In addition to these three assertions, RD3 produces the assertion that all simple
physical conditions are the same in W2 as they are in W1 (line 49).
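One way to picture the restricted frame assertion of lines 44 - 48 is as a lookup in which
specific assertions about a term take priority over a general default. The Python sketch
below is only an analogy under that assumption: the class WorldValues, its methods, and the
extra term Loc(Sf1) are hypothetical and do not appear in the formalism.

```python
class WorldValues:
    """Term values in a world, defaulting to the values in an earlier world."""

    def __init__(self, parent=None):
        self.parent = parent      # world the general frame assertion refers to
        self.specific = {}        # terms that have their own assertions

    def assert_value(self, term, value):
        self.specific[term] = value

    def value(self, term):
        # A term with a specific assertion is excluded from the general
        # assertion, just as the syntactic restrictions on line 48 exclude it.
        if term in self.specific:
            return self.specific[term]
        if self.parent is not None:
            return self.parent.value(term)
        raise KeyError(term)


w1 = WorldValues()
w1.assert_value("Comb(Sf1)", "C1")         # line 23
w1.assert_value("Info(Ppr1)", "@(C1)")     # line 24
w1.assert_value("Loc(Sf1)", "Office")      # hypothetical additional term

w2 = WorldValues(parent=w1)                              # line 48
w2.assert_value("Comb(Sf1)", w1.value("Comb(Sf1)"))      # line 45
w2.assert_value("Info(Ppr1)", w1.value("Info(Ppr1)"))    # line 47

print(w2.value("Comb(Sf1)"), w2.value("Loc(Sf1)"))
```

The design point is that the two excluded terms are answered directly in W2, while any
other term falls through to W1 without a new assertion being created for it.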

50. *T(W2,Can(@(D(W0,John)),Dial(Comb(Sf1),Sf1),Open(Sf1)))          Split

51. •*T(W2,Know(@(D(W0,John)),                                        C1
           And(Eq(@(D(W2,Dial(Comb(Sf1),Sf1))),Dial(Comb(Sf1),Sf1)),
               Res(Do(@(D(W2,John)),Dial(Comb(Sf1),Sf1)),Open(Sf1)))))

52. •*K(:John,W2,W3) ->                                               K1,L
      T(W3,And(Eq(@(D(W2,Dial(Comb(Sf1),Sf1))),Dial(Comb(Sf1),Sf1)),
               Res(Do(@(D(W2,John)),Dial(Comb(Sf1),Sf1)),Open(Sf1))))

Now we try to prove the second half of the goal on line 31, that in the world which we
have just shown to be the result of John reading the piece of paper in W1, namely W2,
John can open the safe by dialing the combination (line 50). C1 reduces this to the goal
that in W2 John knows what action dialing the combination of the safe is, and he knows
that dialing the combination of the safe would result in the safe being open (line 51). K1
reduces this to showing that if W3 is a typical world which is compatible with what John
knows in W2, then in W3, dialing the combination of the safe is the same action as it is in
W2 and John dialing the combination of the safe would result in the safe being open (line
52). To prove this, we first assert the antecedent, and then try to prove the consequent.


53. K(:John,W2,W3)                                                   Ant.

54. K(:John,W3,w3)/[W3 ≠ w3] -> K(:John,W2,w3)                       K3

55. •H(W3,:At(:John,x1)) ∨ ¬H(W2,:At(:John,x1))                      A1

56. H(W3,:At(:John,x1)) ∨ ¬H(W1,:At(:John,x1))                       49

57. •H(W3,:Reads(:John)) ∨ ¬H(W2,:Reads(:John))                      RDS1

58. H(W3,:Reads(:John)) ∨ ¬H(W1,:Reads(:John))                       49


Asserting that W3 is compatible with what John knows in W2 triggers a number of
forward-chaining rules, including K3, A1, and RDS1 (lines 54, 55, and 57). Since the
formulas asserted by A1 and RDS1 both mention a physical condition in W2, and every
physical condition in W2 is the same as in W1, the occurrences of W2 are replaced by W1
(lines 56 and 58).


59. K(:John,W1,W4)                                                   43

60. K(:John,W4,w3)/[W4 ≠ w3] -> K(:John,W1,w3)                       K3

61. H(W4,:At(:John,x1)) ∨ ¬H(W1,:At(:John,x1))                       A1

62. H(W4,:Reads(:John)) ∨ ¬H(W1,:Reads(:John))                       RDS1

63. K(:John,W0,W4)                                                   19

64. K(:John,W4,w3)/[W4 ≠ w3] -> K(:John,W0,w3)                       K3

65. H(W4,:At(:John,x1)) ∨ ¬H(W0,:At(:John,x1))                       A1

66. H(W4,:Reads(:John)) ∨ ¬H(W0,:Reads(:John))                       RDS1

67. •T(W4,Exist(?X1,And(Eq(Comb(Sf1),?X1),Info(Ppr1,?X1))))          10

68. V(W4,:Comb(:Sf1)) = :C4                                          L

69. V(W4,:Info(:Ppr1)) = @(:C4)                                      L


The assertion K(:John,W2,W3) on line 53 also triggers the assertion on line 43 which
describes what John knows in W2. This results in assertions that there is a world, say W4,
which is compatible with what John knows in W1 (line 59), in which the information written
on the piece of paper is the same as in W1 (line 70), and in which the result of John
reading the piece of paper is W3 (line 76). The first of these assertions triggers K3, A1, and
RDS1 as usual (lines 60 - 62). It also triggers line 19 to assert that W4 is compatible with
what John knows in W0 (line 63). This is the first time we have actually used the fact that
for a particular knower, K is transitive (axiom K3). This in turn triggers K3, A1, and
RDS1 (lines 64 - 66). It also triggers line 10 to assert that in W4 the combination of the
safe, represented by :C4, is the only thing written on the piece of paper (lines 67 - 69).


70. •V(W4,:Info(:Ppr1)) = V(W1,:Info(:Ppr1))                         43

71. •@(:C4) = V(W1,:Info(:Ppr1))                                     69

72. •@(:C4) = @(:C1)                                                 24

73. :C4 = :C1                                                        L

74. V(W4,:Comb(:Sf1)) = :C1                                          68

75. V(W4,:Info(:Ppr1)) = @(:C1)                                      69

The second assertion produced by line 43 is that the information written on the piece of
paper in W4 is the same as the information written on the piece of paper in W1. Since the
information written on the piece of paper in W4 is the standard name of the combination
:C4, and in W1 the information written on the piece of paper is the standard name of the
combination :C1, we conclude that :C4 is the same as :C1 (lines 71 - 73). This causes us to
substitute :C1 for :C4 in lines 68 and 69, producing assertions that the combination of the
safe in W4 is :C1, and the information written on the piece of paper in W4 is the standard
name of :C1 (lines 74 - 75). This gives us all the information we need to prove that in the
actual world, W0, John knows that reading the piece of paper would result in his knowing
the combination of the safe.
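The equality reasoning of lines 70 - 75 can be sketched as follows. This is an illustrative
Python fragment, not the equality machinery of the report: the nested-tuple encoding of
terms and the functions standard_name and substitute are assumptions made for the example,
and the sketch assumes that distinct objects have distinct standard names.

```python
def standard_name(obj):
    """Hypothetical stand-in for the standard-name function @."""
    return ("@", obj)


def substitute(assertions, old, new):
    """Replace every occurrence of `old` by `new` in nested-tuple terms."""
    def walk(term):
        if term == old:
            return new
        if isinstance(term, tuple):
            return tuple(walk(t) for t in term)
        return term
    return {key: walk(val) for key, val in assertions.items()}


assertions = {
    ("W1", "Info(Ppr1)"): standard_name("C1"),   # line 24
    ("W4", "Comb(Sf1)"): "C4",                   # line 68
    ("W4", "Info(Ppr1)"): standard_name("C4"),   # line 69
}

# Line 70 says the information on the paper is the same in W4 as in W1.
left = assertions[("W4", "Info(Ppr1)")]
right = assertions[("W1", "Info(Ppr1)")]

# Equal standard names denote equal objects, so C4 is C1 (lines 71 - 73),
# and C1 is substituted for C4 everywhere (lines 74 - 75).
if left[0] == "@" and right[0] == "@" and left[1] != right[1]:
    assertions = substitute(assertions, left[1], right[1])

print(assertions[("W4", "Comb(Sf1)")])
```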


76. R(:Do(:John,:Read(:Ppr1)),W4,W3)                                 43

77. H(W4,:Reads(:John))                                              RD1a

78. H(W4,:At(:John,:Ppr1))                                           RD1a

79. K(:John,W3,w3)/[W3 ≠ w3] <->                                     RD2
      ∃w4(K(:John,W4,w4)/[W4 ≠ w4] ∧
          (V(w4,:Info(:Ppr1)) = V(W4,:Info(:Ppr1))) ∧
          R(:Do(:John,:Read(:Ppr1)),w4,w3))

80. •V(W3,int.trm1) = V(W4,int.trm1)                                 RD3

81. V(W3,:Comb(:Sf1)) = :C1                                          74

82. •V(W3,int.trm1) = V(W4,int.trm1)/[int.trm1 ≠ :Comb(:Sf1)]        74

83. V(W3,:Info(:Ppr1)) = @(:C1)                                      75

84. V(W3,int.trm1) = V(W4,int.trm1)/                                 75
      [(int.trm1 ≠ :Comb(:Sf1)) ∧ (int.trm1 ≠ :Info(:Ppr1))]

85. H(W3,int.p1) <-> H(W4,int.p1)                                    RD3

86. •H(W4,:At(:John,x1)) ∨ ¬H(W1,:At(:John,x1))                      56

87. •H(W4,:Reads(:John)) ∨ ¬H(W1,:Reads(:John))                      58



The last assertion added by line 43 is that W3 is the result of John reading the piece of
paper in W4. This triggers RD1a to assert that in W4 John can read and he is at the same
place as the piece of paper (lines 77 - 78). It also triggers RD2 to produce a description of
the effects of reading the piece of paper on what John knows in W3 (line 79). Finally, it
triggers the frame axiom RD3 to produce a description of the physical effects of John
reading the piece of paper in W4. The first part of this description is the assertion that all
terms refer to the same objects in W3 as they do in W4 (line 80). Since we have specific
assertions about the referents of the terms for the combination of the safe and the
information written on the piece of paper in W4, we syntactically exclude these cases from
the general axiom, and explicitly assert that the information written on the piece of paper
in W3 is the standard name of :C1 and that the combination of the safe in W3 is :C1 (lines
81 - 84). The other assertion produced by the frame axiom is that all simple physical
conditions are the same in W3 as they are in W4 (line 85). This causes the references in line
56 to what John is near in W3 and in line 58 to John being able to read in W3 to be
replaced by these same conditions in W4 (lines 86 - 87). This transforms these assertions
into copies of lines 61 and 62, so they are deleted.

88. •*T(W3,And(Eq(@(D(W2,Dial(Comb(Sf1),Sf1))),Dial(Comb(Sf1),Sf1)),  Cons.
               Res(Do(@(D(W2,John)),Dial(Comb(Sf1),Sf1)),Open(Sf1))))

89. •*T(W3,Eq(@(D(W2,Dial(Comb(Sf1),Sf1))),Dial(Comb(Sf1),Sf1))) ∧     L
      T(W3,Res(Do(@(D(W2,John)),Dial(Comb(Sf1),Sf1)),Open(Sf1)))

90. •*T(W3,Eq(@(D(W2,Dial(Comb(Sf1),Sf1))),Dial(Comb(Sf1),Sf1)))       Split

91. •*:Dial(V(W2,:Comb(:Sf1)),:Sf1) =                                   L
      :Dial(V(W3,:Comb(:Sf1)),:Sf1)

92. •*(V(W2,:Comb(:Sf1)) = V(W3,:Comb(:Sf1))) ∧                         Eq
      (:Sf1 = :Sf1)

93. •*V(W2,:Comb(:Sf1)) = V(W3,:Comb(:Sf1))                             Split

94. •*:C1 = V(W3,:Comb(:Sf1))                                           45

95. •*:C1 = :C1                                                         81

96. *T                                                                  Eq

97. •*:Sf1 = :Sf1                                                       Split

98. *T                                                                  Eq


Now we go back and try to prove the consequent of line 52 from the information
generated by asserting the antecedent. The goal is to show that in W3, dialing the
combination of the safe is the same action as in W2, and that John dialing the combination
of the safe would result in the safe being open (lines 88 - 89). We split this goal into its two
subgoals. The first subgoal reduces to showing that the combination of the safe is the same
in W2 as it is in W3, and that the safe is identical to itself (lines 90 - 92). This goal is also
split, and the first subgoal is solved by noting that the combination to the safe in both W2
and W3 is :C1 (lines 93 - 96). This is basically a proof that in the actual world, W0, John
knows that reading the piece of paper would result in his knowing the combination of the
safe. The goal that the safe is identical to itself is trivially solved by the simplification rules
for identity (lines 97 - 98).

99.  •*T(W3,Res(Do(@(D(W2,John)),Dial(Comb(Sf1),Sf1)),Open(Sf1)))     Split

100. •*R(Do(:John,:Dial(V(W3,Comb(:Sf1)),:Sf1)),W3,w5) ∧               R1,L
       T(w5,Open(Sf1))

101. •*R(Do(:John,:Dial(V(W3,Comb(:Sf1)),:Sf1)),W3,w5)                 Split

102. *R(Do(:John,:Dial(:C1,:Sf1)),W3,w5)                               81

103. •*(V(w3,:Comb(:Sf1)) = :C1) ∧                                     D1b
       :Safe(:Sf1) ∧ H(W3,:At(:John,:Sf1))

104. •*V(w3,:Comb(:Sf1)) = :C1                                         Split

105. *:C0 = :C1                                                        12

106. •*V(w3,:Comb(:Sf1)) = :C1/[w3 ≠ W0]                               12

107. *:C1 = :C1                                                        23

108. *T                                                                Eq

109. *:Safe(:Sf1)                                                      Split

110. *T                                                                2

111. •*H(W3,:At(:John,:Sf1))                                           Split

112. *H(W4,:At(:John,:Sf1))                                            85

113. *H(W1,:At(:John,:Sf1))                                            61

114. *H(W0,:At(:John,:Sf1))                                            20

115. *T                                                                4


The second subgoal of line 89 is to show that dialing the combination of the safe in W3
will result in the safe being open (line 99). This reduces to showing that there is some
world which is the result of John dialing the combination of the safe in W3 in which the
safe is open (line 100). We split this goal into two subgoals, first trying to find a world
which is the result of John dialing the combination of the safe in W3 (line 101). Since the
combination of Sf1 in W3 is known to be :C1, we transform the goal into showing that
there is a world which is the result of John dialing :C1 on Sf1 in W3 (line 102).

D1b says that there is such a world if :C1 is a possible combination of Sf1, if Sf1 is a
safe, and if John is at the same place as Sf1 in W3 (line 103). We split this goal, and first
try to find a world in which :C1 is the combination of Sf1 (line 104). We know that the
combination of Sf1 in W0 is :C0, so by equality substitution we first try showing that :C0 is
the same as :C1 (line 105). We have no rules or assertions that apply to this goal, so it fails.
Next we try to show that :C1 is the combination of the safe in W1, and this succeeds (lines
106 - 108).

The second subgoal of line 103, showing that Sf1 is a safe, is immediately satisfied by
one of the premises of the problem (lines 109 - 110). The last subgoal of line 103 is to show
that John is at the same place as the safe in W3 (line 111). Since all physical conditions in
W3 are the same as in W4, this is transformed into showing that John is at the same place
as the safe in W4 (line 112). Since W4 is possible according to what John knows in W1 and
we are assuming that if John is at the same place as the safe he knows it, we can solve this
goal by showing that John is at the same place as the safe in W1 (line 113). A similar
argument reduces the goal to showing that John is at the same place as the safe in W0 (line
114). But this is another of the problem premises, so the goal is solved (line 115). This
solves all of the subgoals of line 103, so line 102 is also solved with w5 bound to
F1(:John,:C1,:Sf1,W3), which for simplicity we will abbreviate W5.


116. R(Do(:John,:Dial(:C1,:Sf1)),W3,W5)                                Solved

117. V(Wg,:Comb(:Sf1)) = :C1                                           D1a

118. •:Safe(:Sf1)                                                      D1a

119. •H(W3,:At(:John,:Sf1))                                            D1a

120. H(W4,:At(:John,:Sf1))                                             85

121. H(W5,int.p1) <->                                                  D2
       (((int.p1 = :Open(:Sf1)) ∧
         ((V(W3,:Comb(:Sf1)) = :C1) ∨ H(W3,:Open(:Sf1)))) ∨
        ((int.p1 ≠ :Open(:Sf1)) ∧ H(W3,int.p1)))

122. V(W5,int.trm1) = V(W3,int.trm1)                                   D2

123. K(:John,W5,w3)/[W5 ≠ w3] <->                                      D3
       (∃w4(K(:John,W3,w4)/[W3 ≠ w4] ∧
            R(:Do(:John,:Dial(:C1,:Sf1)),w4,w3)) ∧
        (H(w3,:Open(:Sf1)) <-> H(W5,:Open(:Sf1))))


Since line 102 is a goal of the form R(ev1,W1,w2), we assert its solution (line 116). This
triggers D1a to assert that :C1 is a possible combination of Sf1, that Sf1 is a safe, and that
John is at the same place as Sf1 in W3 (lines 117 - 119). This last assertion is transformed
into the assertion that John is at the same place as the safe in W4 (line 120). Also, D2
triggers, producing a description of the physical effects of John dialing :C1 on Sf1 in W3
(lines 121 - 122), and D3 triggers, producing a description of the effects of the action on
what John knows in W5 (line 123).


124. •*T(W5,Open(Sf1))                                                 Split

125. •*H(W5,:Open(:Sf1))                                               L

126. •*(:Open(:Sf1) = :Open(:Sf1)) ∧                                   121
       ((V(W3,:Comb(:Sf1)) = :C1) ∨ H(W3,:Open(:Sf1)))

127. •*:Open(:Sf1) = :Open(:Sf1)                                       Split

128. *T                                                                Eq

129. •*V(W3,:Comb(:Sf1)) = :C1                                         Split

130. •*:C1 = :C1                                                       81

131. *T                                                                Eq


Finally we try to satisfy the other subgoal of line 100, using the solution to the first
subgoal. This gives us the goal of showing that the safe is open in W5 (lines 124 - 125).
This is a question about a simple physical condition in W5, so line 121 is applied. Line 121
tells us that if we want to show that the safe is open in W5, we have to show either that :C1
is the combination of the safe in W3, or that the safe was already open in W3 (line 126).
Whether the safe is open is the question we are interested in (lines 127 - 128), so we try to
show that :C1 is the combination of the safe in W3 (line 129). But we already know that
this is true, so the goal is satisfied (lines 130 - 131). This was the last case we had to
consider, so the proof is complete.
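The description of the physical effects of dialing used on lines 121 and 126 can be read
as a simple case analysis. The following Python sketch paraphrases it under assumed
encodings: the function holds_after_dial, the dictionary layout of a world, and the
variable names are hypothetical and serve only to make the two cases explicit.

```python
def holds_after_dial(cond, dialed, safe, before):
    """Does the simple condition `cond` hold after `dialed` is dialed on `safe`?

    `before` is a hypothetical record of the prior world: 'comb' maps each
    safe to its combination and 'holds' is the set of conditions true there.
    """
    open_cond = ("Open", safe)
    if cond == open_cond:
        # The safe is open afterwards if the right combination was dialed
        # or if the safe was already open.
        return before["comb"][safe] == dialed or open_cond in before["holds"]
    # Every other simple condition is carried over unchanged.
    return cond in before["holds"]


w3 = {"comb": {"Sf1": "C1"}, "holds": {("At", "John", "Sf1")}}

opened = holds_after_dial(("Open", "Sf1"), "C1", "Sf1", w3)
still_there = holds_after_dial(("At", "John", "Sf1"), "C1", "Sf1", w3)
print(opened, still_there)
```

The remaining alternative mentioned in the proof, that the safe was already open before
the dialing, corresponds to the second disjunct in the first branch.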


7.5  Remarks  on  the  Examples 

Following  three  rather  complex  examples  of  algorithmically  generated  deductions 
involving  knowledge  and  action,  some  general  remarks  are  in  order.  First  of  all,  it  is  rather 
surprising  that  such  intuitively  simple  problems  require  such  complicated  deductions.  It  is 
possible  that  our  formalism  is  more  complicated  than  it  should  be,  although  since  it  seems 
possible  to  come  up  with  a  fairly  straightforward  example  which  turns  on  any  given  detail, 
this  seems  unlikely.  It  is  more  probable  that  common-sense  reasoning  is  more  complicated 
than  it  first  appears. 

Although the proofs were long and complex, the deduction process was extremely well
controlled, with virtually no blind searching. In the last and most complicated example, of
the 131 lines generated, 93 were actually necessary for the proof. In terms of percentages,
71% of the lines generated were used; only 29% were not. Another measure of the efficiency
of the search is the fact that 54 formulas were transformed into other formulas or deleted
as soon as they were generated. These represent cases where the knowledge is built into the
system that there is only one appropriate inference to do. Furthermore, the search space
was finite. If at the last moment the proof had failed, it would have soon terminated
anyway. At that point, there was only one alternative left to try, showing that the safe was
already open before the dialing action took place.

This efficiency in searching for a proof was achieved by the careful structuring that
went into the procedural interpretations for the axioms. The proofs were largely driven by
forward chaining. In the last example, there were 75 assertions generated, but only 58 goals.
Most of the assertions were not generated by blind forward chaining from the premises,
however. Only 13 assertions were created in that way. The remaining assertions were
triggered by the goals, either in trying to prove an implication, or by asserting the solution
to a goal of the form R(ev1,W1,w2).

The fundamental structure of these proofs is that the goal itself triggers a large number
of forward inferences which describe a structure of possible worlds, and the goal is reduced
to some fairly simple backward inferences involving that structure. This enables us to
tightly constrain the backward searching. Again, in the last example, while 38 of the 75
assertions generated were actually used in the proof, 57 of the 58 goals were used. This is
somewhat misleading, since the premises of the problem provided only just enough
information to solve the problem, but the really complex parts of the problem involving the
relations among possible worlds offered plenty of possibilities for thrashing if not treated
correctly. Even the forward deductions which were not used in the proof mostly represented
inferences that were reasonable to make. Many of these set up forward-chaining rules that
would have been triggered if the goals had been still more complicated.
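The proportions quoted for the last example can be checked with a few lines of arithmetic.
The sketch below simply recomputes percentages from the counts stated in the text; the
variable and function names are introduced here for illustration only.

```python
# Counts reported in the text for the last example.
lines_generated, lines_needed = 131, 93
assertions_generated, assertions_used = 75, 38
goals_generated, goals_used = 58, 57


def percent(part, whole):
    """Percentage of `whole` represented by `part`, rounded to an integer."""
    return round(100 * part / whole)


lines_pct = percent(lines_needed, lines_generated)
assertions_pct = percent(assertions_used, assertions_generated)
goals_pct = percent(goals_used, goals_generated)
print(lines_pct, assertions_pct, goals_pct)
```

The contrast between the share of assertions used and the share of goals used is what the
preceding paragraph refers to when it describes the backward search as tightly constrained.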

There seems to be only one way in which these methods create a possibility of
generating large numbers of unnecessary inferences. That way involves assertions about
what someone knows that are not required for the problem at hand. These assertions
would be represented as forward-chaining rules of the form (K(A,W1,w2) -> T(w2,Pi)). If we
have a lot of information about what A knows, and hence a large number of Pi's, then
whenever we want to deduce that A knows something, we will assert K(A,W1,W2) and be
inundated by assertions of the form T(W2,Pi). This problem is particularly severe in the
case of axioms like A1 and RDS1, which assert something about what everyone knows.
The only alternative to this within the present framework would be to represent the
statements about what people know as backward-chaining rules, but this seems to be ruled
out for the reasons discussed in section 6.5.

One possible way out of this problem would be to introduce a new kind of syntactic
restriction into the pattern matching routine, so that the pattern H(w2,:P)/[K(A,W1,w2)]
matches the pattern H(W2,:P) if and only if K(A,W1,W2) is asserted. That way, if K(A,W1,W2)
is asserted, then the fact that A knows P will match the pattern H(W2,:P), without any
explicit new formulas being generated. For this to work really efficiently the indexer for
the data base should take these restrictions into account, so that K(A,W1,W2) will be checked
before looking at any of the assertions about what A knows. It appears that this could be
done without too much trouble. The implications of this approach need to be looked at in
more detail, to take into account all the various possibilities of pattern matching, but the
idea looks promising.
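A rough idea of how such a restricted pattern might behave is given by the following
Python sketch. It is an assumption-laden illustration rather than a design for the
indexer: the class KnowledgeBase and its methods are hypothetical, and the sketch covers
only ground conditions, not the full range of pattern-matching cases.

```python
class KnowledgeBase:
    """Sketch of lookup with patterns of the form H(w2,:P)/[K(A,W1,w2)]."""

    def __init__(self):
        self.compatible = set()   # asserted K(agent, w1, w2) triples
        self.known = {}           # (agent, w1) -> conditions the agent knows

    def assert_knows(self, agent, world, cond):
        self.known.setdefault((agent, world), set()).add(cond)

    def assert_k(self, agent, w1, w2):
        # Nothing is forward-chained: no explicit H(w2, cond) is generated.
        self.compatible.add((agent, w1, w2))

    def holds(self, world, cond):
        # The restriction K(A,W1,world) is checked first; only then are the
        # assertions about what A knows in W1 consulted.
        for agent, w1, w2 in self.compatible:
            if w2 == world and cond in self.known.get((agent, w1), ()):
                return True
        return False


kb = KnowledgeBase()
kb.assert_knows("A", "W1", "P")
before = kb.holds("W2", "P")      # K(A,W1,W2) has not been asserted yet
kb.assert_k("A", "W1", "W2")
after = kb.holds("W2", "P")
print(before, after)
```

However many conditions A is recorded as knowing, asserting K(A,W1,W2) adds a single
entry, which is the saving the proposal is aiming for.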


8.  Summary  and  Conclusions 


8.1  What  has  been  Achieved? 

In chapter 1, the goal of this thesis was stated to be the development of a formalism which
(i) takes into account the important role of the agent's knowledge in planning and acting
and (ii) permits reasonably efficient automatic deduction. I believe that the formalism
presented here achieves that goal. The most important ideas which were used in bringing
this about appear to be the following:

(1)  Rather  than  reason  directly  about  what  facts  someone  knows,  we  can  gain 
efficiency  by  reasoning  instead  about  what  possible  worlds  are  compatible  with 
what  he  knows. 

The first problem that we faced in reasoning about knowledge was that, while the basic
facts about knowing are most easily expressed in a modal logic, there are no known
techniques for efficiently searching for proofs in such logics. The solution to this problem
which has been pursued in the thesis is to translate statements expressed in the modal logic
of knowledge into a language which talks about possible worlds, where the reasoning can be
carried out without the use of modal operators. While this idea is not original in itself, the
realization that the possible-world approach could lead to more efficient proof methods than
the known alternatives seems not to have been made before. The reason for this efficiency,
as we pointed out in section 2.3, is that the possible-world approach permits the standard
logical operators to be lifted directly into the first-order meta-language where they can be
operated on using standard deduction methods. The inefficiencies that result from the lack
of this feature in other approaches were analyzed in sections 2.2 and 2.6.
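The way the possible-world reading lets the ordinary logical operators be handled by
ordinary means can be illustrated with a small evaluator. The Python sketch below is a
finite-model illustration only, under assumed encodings: formulas are nested tuples, and
the function true_in and the example worlds are hypothetical. The report itself works by
first-order deduction over an axiomatization, not by evaluating finite models.

```python
def true_in(world, formula, facts, k):
    """Evaluate an object-language formula in a world.

    facts: dict mapping each world to the set of atoms true in it
    k:     dict mapping (agent, world) to the worlds compatible with what
           the agent knows in that world
    """
    op = formula[0]
    if op == "atom":
        return formula[1] in facts[world]
    if op == "not":
        return not true_in(world, formula[1], facts, k)
    if op == "and":
        return all(true_in(world, f, facts, k) for f in formula[1:])
    if op == "or":
        return any(true_in(world, f, facts, k) for f in formula[1:])
    if op == "know":
        _, agent, body = formula
        # Knowing P means P is true in every compatible world.
        return all(true_in(w, body, facts, k) for w in k[(agent, world)])
    raise ValueError(f"unknown operator: {op}")


facts = {"W0": {"open"}, "W1": {"open"}, "W2": set()}
is_open = ("atom", "open")

sure = true_in("W0", ("know", "John", is_open), facts,
               {("John", "W0"): {"W0", "W1"}})
unsure = true_in("W0", ("know", "John", is_open), facts,
                 {("John", "W0"): {"W0", "W2"}})
print(sure, unsure)
```

Conjunction, disjunction and negation are evaluated in the usual way inside each world;
only the knowledge operator refers to other worlds.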

(2)  We  have  worked  out  the  details  of  formalizing  the  semantics  of  a  fully 
quantified  logic  of  knowledge  and  action  in  a  first-order  meta-language. 


There have been other formalizations of the possible-world semantics for the
propositional logic of knowledge carried out in first-order logic, but ours appears to be the
first to successfully work out the important problems of handling quantifiers and equality.
This extension of the previous work was essential to our overall theory of knowledge and
action, which depended heavily on handling quantifying into knowledge contexts correctly.

(3)  We  axiomatize  the  syntax  of  the  modal  logic  of  knowledge  and  action  and  its 
possible-world  semantics  within  a  single  first-order  theory. 

Rather than axiomatizing only the possible-world language for knowledge and action,
we also axiomatize the interpretation of the modal logic of knowledge in that language.
This allows us to state problems in the more compact and more direct modal object
language, while reasoning in the possible-world meta-language, and to formulate concepts
using the object language which are difficult or impossible to represent using the
meta-language alone (e.g., Can). Again, this general technique was borrowed from elsewhere
(McCarthy, 1975), but the idea of using the object-language to gain conceptual power seems
to be new.

(4)  We  integrate  the  logic  of  knowledge  with  the  logic  of  actions  by  identifying 
possible  worlds  in  the  logic  of  knowledge  with  situations  in  the  logic  of  actions. 

To integrate the logic of knowledge with a logic of actions, the logic of actions should
also be expressed in terms of possible worlds. In AI the standard way of looking at an
action is as a binary relation on states of the world, or situations. But this already is
formally a possible-world theory of actions. We make the integration complete by
identifying situations in the logic of actions with possible worlds in the logic of knowledge.
This is a nonstandard interpretation of possible worlds, but it turns out to be more flexible
than the usual approach and it enables us to state very easily the way actions affect what
the agent knows.

(5) We analyze the knowledge preconditions for actions in terms of knowing what
action to perform, and we describe the effects of actions on knowledge in terms
of patterns of relationships among possible-worlds.

These are the key theoretical contributions of this thesis. Both these ideas seem to be
entirely original. As shown in chapter 5, together they enable us to minimize the amount of
problem-specific knowledge that must be used to make the inferences we want to make
about the interaction of knowledge and action. An example of this is how the notion of a
test falls out as a special case of our general theory of knowledge and action.
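The pattern of relationships among possible worlds that describes an informative action
such as reading can be sketched for a finite set of worlds. The Python fragment below is a
simplified illustration in the spirit of the reading axiom used in the last example: the
function compatible_after_read and the world names Wa, Wb and Wc are hypothetical, and
the sketch assumes that reading is possible in every world considered.

```python
def compatible_after_read(actual, compatible_before, info, read_result):
    """Worlds compatible with the reader's knowledge after reading.

    actual:            the world in which the reading takes place
    compatible_before: worlds compatible with what the reader knew before
    info:              dict mapping a world to what is written on the paper
    read_result:       dict mapping a world to the result of reading in it
    """
    # Keep only the prior alternatives that agree with the actual world on
    # what is written, and move each of them forward through the action.
    return {
        read_result[w]
        for w in compatible_before
        if info[w] == info[actual]
    }


info = {"W1": "C1", "Wa": "C1", "Wb": "C9"}
comb = {"W1": "C1", "Wa": "C1", "Wb": "C9"}        # the paper states the combination
read_result = {"W1": "W2", "Wa": "W3", "Wb": "Wc"}
comb.update({"W2": "C1", "W3": "C1", "Wc": "C9"})  # reading leaves it unchanged

before = {"W1", "Wa", "Wb"}
after = compatible_after_read("W1", before, info, read_result)

knew_comb_before = len({comb[w] for w in before}) == 1
knows_comb_after = len({comb[w] for w in after}) == 1
print(knew_comb_before, knows_comb_after)
```

The alternative in which something else is written on the paper is eliminated by the
action, so the combination has the same value in every remaining world.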

(6)  We  use  domain-specific  control  information  to  help  produce  efficient  solutions  to 
problems. 

The  logic  of  knowledge  and  action  used  in  this  thesis  is  a  complex  axiomatization  on  an 
infinite  domain  of  possible  worlds.  There  are  numerous  possibilities  for  generating  fruitless 
infinite  searches  in  attempting  to  do  automatic  deductions  in  this  formalism.  By  carefully 
controlling  the  way  the  axioms  of  the  theory  are  used,  we  have  been  able  to  restrict  the 
search  in  typical  problems  to  a  well-behaved  finite  space.  While  most  of  the  techniques  we 
use  are  not  new,  our  sample  problems  are  some  of  the  most  complex  to  which  they  have 
ever  been  applied,  so  our  positive  results  represent  encouraging  evidence  for  the  usefulness 
of  these  techniques. 

These  ideas  solve  many  of  the  problems  of  reasoning  about  knowledge  and  action,  but 
there  are  other  questions  in  this  area  that  we  have  left  untouched.  In  the  next  section  we 
will  examine  some  of  the  limitations  of  the  current  approach,  and  we  will  conclude  by  trying 
to place this piece of work in the context of the overall goals of AI.


8.2  Limitations  and  Extensions  of  the  Current  Approach 

The  approach  to  reasoning  about  knowledge  and  action  presented  in  this  thesis  has  a 
number  of  limitations.  Some  of  these  are  limitations  of  the  logic  of  knowledge  and  action; 
others  involve  the  procedural  ideas  for  generating  deductions. 

One  way  of  improving  the  logic  of  knowledge  would  be  to  make  it  more  in  agreement 
with  "common-sense  psychology",  that  is,  make  it  closer  to  the  way  people  usually  describe 
the  reasoning  processes  of  others.  A  serious  treatment  of  the  issues  of  plausible  reasoning 
raised in chapter 2 would be a major improvement. For instance, it would be nice to be
able to reason that although the laws of arithmetic imply that every positive integer is the
sum of four squares, if John does not know much mathematics, we shouldn't assume that he
knows this fact even if he knows the laws of arithmetic. Formalizing this reasoning would
require, among other things, specifying what inferences are "about" mathematics.

A general problem here is that the possible-world approach makes it difficult to specify
exactly what inference a person fails to make. Suppose that John knows that P, that (P ⊃
Q), and that (Q ⊃ R). Suppose that we also know that John is likely not to notice that R is
true even if he knows that Q is true. In the possible-world formalism, however, the only
place that we can block the inference that John knows that R is true is in going from the fact
that R is true in every world which is compatible with what John knows to the conclusion
that John knows that R is true. The step that really corresponds to the inference which
John fails to make is that if (Q ⊃ R) and Q are true in every world which is compatible with
what John knows, then R is true in every world which is compatible with what John knows.
But this inference cannot be blocked by the logic, because it is perfectly valid. One might
say that the inferences that John does not make get "stacked up" while reasoning in the
possible-world domain, and they all have to be cashed in, in the single step of going from
what is possible according to what he knows to what he actually concludes. Formalizing
this seems likely to be difficult.
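
The point that the inference cannot be blocked inside the possible-world domain can be illustrated with a toy propositional model. The following is only a sketch under simplifying assumptions that are not part of the thesis: worlds are taken to be truth assignments to three atoms, and the names `ATOMS`, `JOHNS_FACTS`, and `knows` are hypothetical.

```python
from itertools import product

ATOMS = ("P", "Q", "R")


def implies(a, b):
    """Material implication."""
    return (not a) or b


# Assumption: a "world" is just one truth assignment to the atoms.
ALL_WORLDS = [dict(zip(ATOMS, values))
              for values in product([True, False], repeat=len(ATOMS))]

# What John is stipulated to know: P, (P implies Q), (Q implies R).
JOHNS_FACTS = [
    lambda w: w["P"],
    lambda w: implies(w["P"], w["Q"]),
    lambda w: implies(w["Q"], w["R"]),
]

# Worlds compatible with what John knows: every stipulated fact holds there.
compatible = [w for w in ALL_WORLDS if all(fact(w) for fact in JOHNS_FACTS)]


def knows(formula):
    """John 'knows' a formula iff it holds in every compatible world."""
    return all(formula(w) for w in compatible)


john_knows_q = knows(lambda w: w["Q"])
john_knows_r = knows(lambda w: w["R"])
```

In this sketch `john_knows_r` comes out true as soon as the three facts are stipulated. There is no intermediate step at which one could say that John noticed Q but failed to notice R.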

We can handle some of these problems if the reasoning processes we wish to describe
are expressible in terms of the procedural interpretations which we give to formulas. For
example, in chapter 6 it was shown how Know(John,(P => Q)) is processed so that (P => Q) gets
its usual procedural interpretation in the context of what John knows. Suppose then that
the reason that John does not infer R, even though he knows P, (P ⊃ Q), and (Q ⊃ R), is that
he uses (P ⊃ Q) as a backward-chaining rule, e.g. (Q <= P), and he uses (Q ⊃ R) as a
forward-chaining rule, e.g. (Q => R). If this were the case, John would not be able to infer R from P,
because both rules are triggered by the intermediate assertion Q, which never gets generated.
We can simulate this by making the assertions Know(John,(Q <= P)) and Know(John,(Q => R)).
These assertions would not generate a deduction of the goal R from the premise P, not
because the logical interpretation blocks the inference, but because the procedural
interpretation does. This is about as close as we can come to reasoning about what someone
knows by simulating his reasoning.
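
The blocking effect of the two procedural interpretations can be sketched with a very small rule interpreter. This is an illustrative simplification, not the deductive system described in chapter 6: the function `try_to_derive` and its rule encodings are hypothetical, forward rules fire only on asserted facts, and backward rules fire only on goals.

```python
def try_to_derive(goal, premises, forward_rules, backward_rules):
    """Return True if `goal` can be established.

    forward_rules:  (trigger, conclusion) pairs, fired when trigger is asserted.
    backward_rules: (conclusion, condition) pairs, fired when conclusion is a goal.
    """
    facts = set(premises)

    # Forward chaining: run to a fixed point over the asserted facts.
    changed = True
    while changed:
        changed = False
        for trigger, conclusion in forward_rules:
            if trigger in facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True

    # Backward chaining: reduce a goal to the condition of a matching rule.
    def prove(current_goal, seen):
        if current_goal in facts:
            return True
        if current_goal in seen:
            return False
        return any(prove(condition, seen | {current_goal})
                   for conclusion, condition in backward_rules
                   if conclusion == current_goal)

    return prove(goal, frozenset())


# John's rules: (Q <= P) used backward, (Q => R) used forward.
john_infers_r = try_to_derive("R", {"P"},
                              forward_rules=[("Q", "R")],
                              backward_rules=[("Q", "P")])
john_infers_q = try_to_derive("Q", {"P"},
                              forward_rules=[("Q", "R")],
                              backward_rules=[("Q", "P")])

# For contrast: both rules used forward.
forward_only_infers_r = try_to_derive("R", {"P"},
                                      forward_rules=[("P", "Q"), ("Q", "R")],
                                      backward_rules=[])
```

With John's rules the goal R fails, because no forward rule is triggered by the assertion P and no backward rule concludes R. The goal Q, asked for directly, succeeds. With both rules used forward, R is derived.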

This  idea  seems  fairly  promising  for  many  applications.  Ironically,  where  it  most 
obviously  fails  is  in  reasoning  about  what  someone  knows  about  knowledge,  that  is,  where 
we  would  have  two  or  more  nested  applications  of  the  modal  operator  Know.  The  problem 
is  that  the  possible-world  theory  of  knowledge  is  a  much  more  powerful  method  of 
reasoning  about  what  people  know  than  the  methods  people  themselves  seem  to  use.  That 
is  all  right  when  we  are  trying  to  reason  about  what  John  knows  about  blocks,  but  leads  to 
problems  when  we  try  to  reason  about  what  John  knows  about  what  Bill  knows  about 
blocks.  The  difficulty  is  that  we  pretend  that  John  also  uses  the  possible-world  theory  of 
knowledge  in  reasoning  about  what  Bill  knows.  By  doing  this  we  run  the  risk  that  we  may 
credit  John  with  much  better  abilities  to  reason  about  what  Bill  knows  than  John  actually 
uses. 

To be more specific, most people seem to have little trouble with the inference that if
John knows that Bill knows that (P ⊃ Q) and John knows that Bill knows that P, then John
knows that Bill knows that Q. (This is, of course, a plausible inference based on the
assumption that people generally know the consequences of their knowledge.) But the same
assumptions that lead to this inference lead to the conclusion that if John knows that Bill
knows that (P ⊃ Q) and John doesn't know that Bill doesn't know that P is false, then John
doesn't know that Bill doesn't know that Q is false. This inference seems to be not obvious
at all to most people, yet in the possible-world theory, it is of no greater complexity than the
first inference.

The trouble is that, given that John knows (P ⊃ Q), people seem much better at
reasoning that if John knows that P, then he probably knows that Q, than at reasoning that
if John doesn't know that Q, then he probably doesn't know that P. The second rule is
simply the contrapositive of the first, and is therefore logically equivalent to it. The second
inference  in  the  previous  paragraph  requires  two  applications  of  this  principle,  one 
application  being  to  the  axiom  which  expresses  the  principle.  This  self-application  of  an 
already  difficult  principle  of  reasoning  is  what  makes  that  inference  so  obscure.  My 
intuition  is  that  if  we  allowed  this  principle  to  be  applied  where  P  and  Q  involve  only 
nonintensional  concepts  (i.e.  operators  that  are  not  explained  in  terms  of  possible  worlds), 
then  we  would  have  a  more  reasonable  model  of  the  reasoning  ability  of  people.  I  believe 
that  this  could  be  imposed  on  top  of  the  possible-world  theory  of  knowledge  by  syntactic 
restrictions  of  the  type  we  have  been  using,  but  the  details  of  this  remain  to  be  worked  out. 
Probably,  however,  there  ought  to  be  a  search  for  a  different  approach  in  which  this 
restriction  would  be  more  natural. 

Another  problem  is  that  simply  saying  that  John  knows  that  P(A)  does  not  adequately 
characterize  John’s  knowledge.  It  does  not  distinguish  the  case  where  John  is  able  to 
answer  the  question  "Is  P(A)  true?"  from  the  case  where  he  can  supply  A  as  an  answer  to  a 
request to name something that has property P. (This distinction was pointed out to me by
John McCarthy.) If the property P is being the solution to some high-order polynomial
equation, the difference between the two is vast. The first interpretation requires only that
John have some very simple knowledge of elementary algebra, so that he can plug in the
proposed solution to see whether it works. The second interpretation might require John to
have very sophisticated skills in algebraic manipulation. Neither the possible-world
approach nor the modal logic of knowledge takes account of this distinction.

A  related  distinction  which  we  might  wish  to  draw  is  the  difference  between  what 
someone  explicitly  knows  and  what  he  can  deduce  from  his  knowledge.  For  example,  most 
people  would  explicitly  know  that  2  is  an  even  number,  but  not  that  38194604  is  an  even 
number. Most people do "know" that this number is even, however, in the sense that they
can  readily  deduce  that  it  is  even  from  the  fact  that  the  last  digit  is  4.  This  distinction 
would  certainly  be  part  of  a  more  detailed  theory  of  knowledge,  but  ignoring  it  does  not 
seem  to  produce  any  striking  anomalies,  as  does  ignoring  the  distinction  made  in  the 
previous  paragraph. 

Our  system  also  has  certain  limitations  in  its  fundamental  logical  power.  Some  of  these 
derive from basing our system on modal logic. The key fact here is that from the point of
view  of  the  object  language,  Know  is  an  operator  which  is  applied  to  a  term  denoting  a 
possible  knower  and  a  formula  which  expresses  a  fact  that  he  knows.  An  alternative 
approach  would  be  to  make  Know  a  predicate  which  applies  to  a  term  denoting  a  possible 
knower  and  a  term  denoting  a  formula.  With  the  modal  logic  we  have  to  be  specific  about 
what someone knows; making Know a predicate (usually called the "syntactic" approach
(Montague, 1963)) would allow much greater flexibility. For instance, currently we cannot
express something like "John knows what Bill said" in the object language, because the
English phrase "what Bill said" cannot be represented by a formula. If we know that Bill
said that all crows are black, we could express the fact that John knows that Bill said that
all crows are black, but the more direct statement that John knows what Bill said cannot be
made. With the syntactic approach, there would be no reason in principle why "what Bill
said" could not be represented as a term denoting a formula.

The  main  reason  that  modal  logics  are  generally  favored  over  syntactic  methods, 
however,  is  that  there  are  severe  difficulties  in  formalizing  the  syntactic  approach. 
Montague  (1963)  has  proved  that  syntactic  treatments  of  modal  concepts  which  have  certain 
rather general (and superficially desirable) properties are in fact inconsistent. Specifically,
any  syntactic  treatment  of  a  modal  logic  is  inconsistent  if  it  has  axioms  corresponding  to 
M1, M2, M4, and M5, and if it has a finite set of axioms which allow all recursive
functions  on  names  of  formulas  to  be  represented.  Such  theories  are  inconsistent  because 
they  allow  the  formation  of  self-referential,  paradoxical  sentences.  The  simplest  example  of 
this  is  attempting  to  syntactically  axiomatize  the  notion  of  truth.  If  this  is  done  in  a  theory 
that meets Montague's conditions, then there will be at least one term Exp1 which denotes
the formula ¬True(Exp1). This formula, then, asserts its own falsehood. A similar, although
more complex, construction can be carried out for syntactic treatments of modalities such as
Know. (See Montague (1963).)

One  might  attempt  to  restrict  the  language  so  that  self-referential  statements  cannot  be 
formed.  Kripke  (1975)  points  out,  though,  that  whether  a  statement  is  self-referential  is 
often  a  matter  of  empirical  fact  rather  than  a  matter  of  form.  To  take  the  simpler  case  of 
the predicate True, suppose that on a certain day, John makes the prediction "Everything
Bill  says  today  will  be  true,"  and  says  nothing  else  all  day.  If  all  of  Bill's  statements  on  that 
day  have  determinate  truth  values,  then  there  will  be  no  problem  assigning  a  truth  value  to 
John's prediction. Suppose, however, that the only thing that Bill says on that day is
"Everything John says today will be false." In this case, John's statement is true if and only
if Bill's statement is true, but Bill's statement is true if and only if John's statement is false.
Thus we have a paradox. The point is that the paradoxical nature of these statements
depends on their being taken together. Most of the time they can be used independently to
make  perfectly  reasonable  assertions.  Any  attempt  to  restrict  the  form  of  such  sentences  will 
have  to  rule  out  sentences  that  are  all  right  most  of  the  time.  It  seems  likely  that  similar 
considerations  will  apply  in  the  more  complicated  case  of  Know. 
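
That the paradox is a matter of circumstance rather than form can be checked by brute force. The sketch below is an assumption-laden illustration, not Kripke's construction: each speaker is reduced to a single truth value, and the helper `consistent_assignments` and the "harmless" scenario are hypothetical.

```python
from itertools import product


def consistent_assignments(john_claim, bill_claim):
    """List the (john, bill) truth values under which each statement
    has exactly the truth value that its own content demands."""
    result = []
    for john, bill in product([True, False], repeat=2):
        if john == john_claim(bill) and bill == bill_claim(john):
            result.append((john, bill))
    return result


# John: "Everything Bill says today will be true" (Bill says one thing).
# Bill: "Everything John says today will be false" (John says one thing).
paradox = consistent_assignments(lambda bill: bill,
                                 lambda john: not john)

# Same sentence from John, but Bill instead says something plainly true
# (a hypothetical, independent statement).
harmless = consistent_assignments(lambda bill: bill,
                                  lambda john: True)
```

Here `paradox` is empty, since no assignment of truth values satisfies both statements, while `harmless` contains exactly one assignment. John's sentence is the same in both cases. Only what Bill happens to say differs.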

Kripke’s  solution  to  this  problem  is  not  to  restrict  the  language,  but  rather  to  define  the 
semantics  of  the  language  so  that  statements  or  sets  of  statements  that  are  self-referential  are 
not  assigned  a  truth  value.  How  this  is  done  for  True  is  explained  in  Kripke  (1975).  These 
techniques  would  seem  to  be  applicable  to  Know  as  well. 

Another  extension  to  the  logical  power  of  the  formalism  would  be  to  allow  the  system  to 
reason  about  its  own  knowledge.  This  raises  issues  which  are  surprisingly  quite  different 
from  reasoning  about  the  knowledge  of  others.  For  example,  if  the  system  uses  I  to  refer  to 
itself and W0 to refer to the current situation, then all true statements of the form
T(W0,Know(I,P)) are recursively enumerable for the system. Furthermore, if the underlying
logic is decidable, all statements of the form T(W0,Know(I,P)) should be decidable. If this is
the case, it makes no sense to have any explicit assertions of the form T(W0,Know(I,P)) or
¬T(W0,Know(I,P)). Any assertion of this form will either be implied or contradicted by an
assertion  already  implicit  in  the  system.  In  fact,  the  most  direct  way  to  implement  reasoning 
about  the  system’s  own  knowledge  is  as  a  recursive  call  to  the  deductive  routines  to  evaluate 
any  expressions  in  this  form. 

Things  get  a  little  more  complicated  if  we  allow  quantifying  into  such  expressions.  For 
instance,  suppose  we  want  to  tell  the  system  that  it  knows  everything  that  has  property  P. 
This could be expressed formally as T(W0,All(?X1,(P(?X1) => Know(I,P(?X1))))). One
interpretation of this formula should be that in order to prove that P(A) is false, prove that
A is not one of the objects which the system is able to deduce has property P. (Recall that
deducing that an object has property P means deducing P(B), where B is a rigid designator
for the object.)

At  present  it  is  not  entirely  clear  to  me  how  to  implement  this  extension  to  our 
formalism. One particular problem is that this interpretation of T(W0,Know(I,P)) makes the
system "non-monotonic" in Minsky's (1974) phrase. That is, we can have a theory with the
property  that  adding  axioms  causes  some  statements  that  were  previously  theorems  to  be 
non-theorems.  The  preceding  paragraph  provides  a  good  example  of  this.  Suppose  that  A, 
B,  and  C  are  the  only  objects  that  we  can  explicitly  prove  have  property  P.  Then  if  we 
know  that  D  is  not  the  same  as  A,  B,  or  C,  and  we  know  that  we  know  everything  that  has 
property P, we would want to be able to prove ¬P(D). But if we explicitly add P(D) as a
theorem, ¬P(D) should no longer be provable. It is well known that this type of reasoning
can create problems (Sandewall, 1972). For example, if P is asserted to be true whenever Q
is not deducible, and Q is asserted to be true whenever P is not deducible, then to be
consistent we must regard either P or Q as deducible, although we may have no basis to
choose  one  over  the  other.  This  is  reminiscent  of  the  self-reference  problems  which  we 
discussed  above,  and  Kripke’s  techniques  may  be  of  use  here  as  well. 
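
Both phenomena in this paragraph can be sketched in a few lines. The code is a toy illustration under stated assumptions, not an implementation of the proposed extension: objects are represented by distinct names (standing in for rigid designators known to be distinct), and the functions `provably_not_p` and `stable_sets` are hypothetical.

```python
from itertools import chain, combinations


def provably_not_p(candidate, proven_p, knows_all_p=True):
    """Conclude not-P(candidate) when the system is told it knows every P
    and the candidate is not among the objects proved to have P."""
    return knows_all_p and candidate not in proven_p


proven = {"A", "B", "C"}
not_p_d_before = provably_not_p("D", proven)          # a theorem now
not_p_d_after = provably_not_p("D", proven | {"D"})   # lost after adding P(D)


def stable_sets(atoms, rules):
    """rules maps an atom to the atom whose NON-deducibility licenses it.
    A set S is stable if it is exactly what the rules license relative to S."""
    subsets = chain.from_iterable(combinations(atoms, n)
                                  for n in range(len(atoms) + 1))
    stable = []
    for subset in subsets:
        s = set(subset)
        licensed = {atom for atom, blocker in rules.items() if blocker not in s}
        if licensed == s:
            stable.append(s)
    return stable


# P holds whenever Q is not deducible; Q holds whenever P is not deducible.
choices = stable_sets(["P", "Q"], {"P": "Q", "Q": "P"})
```

Adding P(D) as an axiom removes ¬P(D) from the theorems, which is the non-monotonic behavior described above. In the second example the only consistent sets of deducible statements are {P} and {Q}, and nothing in the rules selects between them.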

There  are  also  some  interesting  problems  to  be  considered  which  relate  to  making 
deductions  about  lack  of  knowledge.  For  instance,  as  this  is  being  written,  I  am  quite 
certain  that  nothing  that  I  know  would  tell  me  whether  the  President  is  sitting  down  at  this 
moment.  It  is  clear,  however,  that  I  did  not  come  to  this  conclusion  by  exploring  all  the 
consequences  of  everything  I  know.  Rather,  it  seems  as  though  I  have  partitioned  my 
knowledge  into  independent  subsets,  and  I  find  there  is  no  information  in  the  subset  that 
would  contain  statements  about  the  President's  postural  position.  (Once  again,  the  credit  for 
recognizing  this  problem  belongs  to  John  McCarthy.) 

Some work on this type of problem is reported in an unpublished paper by Goad
(1976). The basic idea is this: If we have a set of formulas that "span" John's knowledge (i.e.
everything he knows is derivable from those formulas), we can prove that he doesn't know
P by proving that there is some possible world in which all the formulas of the knowledge
set plus ¬P are true. This approach has two major difficulties. First, it requires a complete
description of what worlds are a priori possible. A potential solution to this problem is to
use the techniques discussed above to say that there is a possible world that fits a given
description  unless  there  is  a  proof  that  no  such  world  exists.  The  second  and  more  serious 
difficulty,  though,  is  that  Goad’s  proposal  gives  us  no  help  with  the  partitioning  problem. 
What  we  need  is  a  way  of  saying  that  such-and-such  is  all  that  John  knows  about  P,  so  that 
we  restrict  our  attention  to  the  relevant  facts.  It  is  not  at  all  obvious  how  this  notion  of 
"about"  can  be  captured. 

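The basic idea attributed to Goad above can be sketched for the propositional case. This is a minimal illustration, not Goad's method as published: it assumes that every truth assignment to the listed atoms is an a priori possible world (exactly the assumption the first difficulty calls into question), and the names `does_not_know`, `ATOMS`, and the example atoms are hypothetical.

```python
from itertools import product


def does_not_know(knowledge_set, formula, atoms):
    """Return True if some world satisfies every formula in the spanning
    knowledge set while falsifying `formula`."""
    for values in product([True, False], repeat=len(atoms)):
        world = dict(zip(atoms, values))
        if all(k(world) for k in knowledge_set) and not formula(world):
            return True
    return False


ATOMS = ("Raining", "PresidentSitting")

# Hypothetical spanning set: the only thing John knows is that it is raining.
johns_knowledge = [lambda w: w["Raining"]]

ignorant_of_sitting = does_not_know(johns_knowledge,
                                    lambda w: w["PresidentSitting"], ATOMS)
ignorant_of_rain = does_not_know(johns_knowledge,
                                 lambda w: w["Raining"], ATOMS)
```

The search visits every world, so its cost grows exponentially with the number of atoms. This is one way of seeing why some means of restricting attention to the relevant facts is needed.
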
In addition to the limitations of our logic of knowledge, it should be pointed out that we
do not even pretend to attack the serious limitations of the situation calculus approach to
describing actions. The most obvious such limitation is the inability to reason about
concurrent actions. We also have avoided the problem of actions being continuous
processes rather than discrete steps. A third problem would be representing action
modifiers, e.g. relating dialing the combination of the safe to dialing the combination of the
safe carefully or hurriedly or left-handed, etc. A really adequate logic of actions would
have  to  solve  all  these  problems  and  probably  many  more. 

Finally,  there  are  the  limitations  of  the  procedural  techniques  we  have  used  in 
generating  deductions.  There  is  much  less  to  be  said  about  this  than  about  the 
representational  issues,  not  because  there  are  fewer  problems,  but  because  the  problems  are 
much  less  well  understood.  One  observation  is  that  we  have  not  presented  anything  like  a 
coherent strategy for doing deductions in this domain. Instead, we looked microscopically at
individual axioms to see how they would behave if used in certain ways. This gives us no
guarantee  that  we  have  not  overlooked  some  major  problem,  or  that  if  we  attempt  to  extend 
the  system,  things  will  not  get  completely  out  of  control.  This  is  largely  a  consequence  of  the 
fact  that  AI  has  produced  no  real  theory  of  how  to  control  deductive  processes,  only  a  large 
number  of  examples  of  what  will  or  will  not  work  in  particular  cases.  Very  recently  some 
serious  work  has  begun  on  including  explicit  control  axioms  in  deductive  systems 
(McDermott, 1976) (de Kleer et al., 1977) (Doyle, 1978) (McAllester, 1978). This may
provide  superior  ways  of  supplying  the  control  information  that  is  needed  by  our  formalism 
and  should  be  investigated  further. 

Two  more  points  about  these  problems:  First,  one  possible  objection  to  the  way  we  have 
embedded  heuristic  knowledge  in  the  axioms  of  our  system  is  that  although  our  goal  was  to 
make  any  such  knowledge  applicable  over  the  entire  problem  domain,  we  have  actually  put 
it  into  specific  axioms  about  specific  actions.  Although  this  is  true,  it  should  be  pointed  out 
that  all  of  the  axioms  describing  actions  were  in  very  stereotyped  forms.  It  does  not  appear 
to  be  difficult  for  the  system  to  accept  one  of  these  axioms  in  a  neutral  form,  recognize  what 
type  of  axiom  it  is,  and  automatically  add  the  required  heuristic  information. 

Second,  we  pointed  out  in  section  7.1  that  the  procedural  interpretations  which  were 
given  to  the  axioms  describing  actions  are  strongly  biased  towards  checking  the  effects  of  a 
given  action,  as  opposed  to  finding  an  action  which  will  produce  desired  effects.  This 
leaves  completely  open  the  problem  of  generating  plans  which  involve  the  acquisition  of 
knowledge. A major worry here is that in our formalism it might be necessary to search
an  infinite  set  of  possible  worlds  to  do  plan  generation.  The  deductive  techniques  we  have 
developed  so  far  depend  on  the  search  space  being  finite.  Perhaps  one  way  out  of  this 
problem  would  be  to  propose  possible  plans  using  some  weaker  formalism  that  does  not  talk 
about  possible  worlds,  and  then  test  the  proposed  plans  in  the  more  rigorous  possible-world 
formalism.  Whatever  the  case,  this  looks  like  a  very  rich  area  for  further  research. 


8.3 Conclusions

In  their  classic  (1969)  paper,  McCarthy  and  Hayes  define  three  standards  of  adequacy 
for  representations  of  knowledge.  The  first  standard  is  called  metaphysical  adequacy.  A 
representation  is  metaphysically  adequate  if  every  aspect  of  reality  has  a  description  in 
terms  of  the  representation.  This  seems  to  be  what  Laplace  had  in  mind  when  he  asserted 
that  given  the  position  and  velocity  of  every  particle  in  the  universe  and  all  the  forces 
acting  on  them,  he  could  predict  exactly  the  future  history  of  the  universe.  A  modern 
analogue  of  this  representation  might  be  the  quantum  mechanical  wave  equation  for  the 
entire  universe. 

The  trouble  with  representations  such  as  these  is  that  they  cannot  be  used  in  any 
practical  way  to  represent  the  knowledge  that  an  intelligent  being  actually  has  about  the 
world.  Representations  that  can  be  used  in  this  way  are  called  epistemologically  adequate. 
This  is  the  standard  of  adequacy  to  which  formal  logic  directs  itself.  Finally,  a 
representation  system  is  heuristically  adequate  to  the  extent  that  it  can  represent  knowledge 
about  how  to  solve  problems  involving  those  aspects  of  the  world  represented  in  the  system. 

It  seems  clear  that  epistemological  and  heuristic  adequacy  are  the  twin  standards  by 
which  work  in  Artificial  Intelligence  must  be  judged.  Moreover,  I  believe  that  these  two 
goals cannot be pursued independently. Representation systems may display a remarkable
degree of epistemological adequacy without there being any indication of how they can be
used in a practical way to do reasoning. The modal logic of knowledge discussed in chapter
2 seems to fit this description. On the other hand, heuristic methods that work for
representation systems of limited descriptive power may be of little use in richer systems. As
we pointed out in chapter 6, the methods used in PLANNER and related languages run
into difficulties if they are applied to systems that permit incomplete descriptions of
situations.


It  is  for  these  reasons  that  this  thesis  has  emphasized  both  representational  and 
procedural issues. We have tried to increase the range of facts that an AI system can
describe, while at the same time giving the system some degree of competence in reasoning
with those descriptions. Much more progress will be required before AI systems approach
the common-sense reasoning abilities of humans. I hope that the research reported here is a
step in that direction.


Appendix A: First-Order Axioms for Knowledge and Action


L1. ∀p1(True(p1) ≡ T(W0,p1))

L2. ∀w1,p1,p2(T(w1,And(p1,p2)) ≡ (T(w1,p1) ∧ T(w1,p2)))

L3. ∀w1,p1,p2(T(w1,Or(p1,p2)) ≡ (T(w1,p1) ∨ T(w1,p2)))

L4. ∀w1,p1,p2(T(w1,(p1 => p2)) ≡ (T(w1,p1) ⊃ T(w1,p2)))

L5. ∀w1,p1,p2(T(w1,(p1 <=> p2)) ≡ (T(w1,p1) ≡ T(w1,p2)))

L6. ∀w1,p1(T(w1,Not(p1)) ≡ ¬T(w1,p1))

L7. ∀w1(T(w1,Exist(?Si,P)) ≡ ∃si(T(w1,P[@(si)/?Si])))

L8. ∀w1(T(w1,All(?Si,P)) ≡ ∀si(T(w1,P[@(si)/?Si])))

L9a. ∀w1,trm1,...,trmn(T(w1,P(trm1,...,trmn)) ≡ H(w1,:P(D(w1,trm1),...,D(w1,trmn))))

if P is not an essential property of the things it is true of.

L9b. ∀w1,trm1,...,trmn(T(w1,P(trm1,...,trmn)) ≡ :P(D(w1,trm1),...,D(w1,trmn)))

if P is an essential property of the things it is true of.

L10. ∀w1,x1(D(w1,@(x1)) = x1)

L11a. ∀w1(D(w1,Cnst) = V(w1,:Cnst)) if Cnst is not a rigid designator.

L11b. ∀w1(D(w1,Cnst) = :Cnst) if Cnst is a rigid designator.

L12a. ∀w1,trm1,...,trmn(D(w1,F(trm1,...,trmn)) = V(w1,:F(D(w1,trm1),...,D(w1,trmn))))

if F is not a rigid function.

L12b. ∀w1,trm1,...,trmn(D(w1,F(trm1,...,trmn)) = :F(D(w1,trm1),...,D(w1,trmn)))

if F is a rigid function.

L13. ∀w1,trm1,trm2(T(w1,Eq(trm1,trm2)) ≡ (D(w1,trm1) = D(w1,trm2)))

K1. ∀w1,trm.a1,p1(T(w1,Know(trm.a1,p1)) ≡ ∀w2(K(D(w1,trm.a1),w1,w2) ⊃ T(w2,p1)))

K2. ∀a1,w1(K(a1,w1,w1))

K3. ∀a1,w1,w2(K(a1,w1,w2) ⊃ ∀w3(K(a1,w2,w3) ⊃ K(a1,w1,w3)))

R1. ∀w1,trm.ev1,p1
    (T(w1,Res(trm.ev1,p1)) ≡ ∃w2(R(D(w1,trm.ev1),w1,w2) ∧ T(w2,p1)))

R2. ∀w1,trm.a1,trm.act1,trm.act2,p1
    (T(w1,Res(Do(trm.a1,(trm.act1; trm.act2)),p1)) ≡
     T(w1,Res(Do(trm.a1,trm.act1),Res(Do(@(D(w1,trm.a1)),trm.act2),p1))))

R3. ∀w1,trm.a1,trm.act1,trm.act2,p1,p2
    (T(w1,Res(Do(trm.a1,If(p1,trm.act1,trm.act2)),p2)) ≡
     ((T(w1,p1) ∧ T(w1,Res(Do(trm.a1,trm.act1),p2))) ∨
      (¬T(w1,p1) ∧ T(w1,Res(Do(trm.a1,trm.act2),p2)))))

R4. ∀w1,trm.a1,trm.act1,p1,p2
    (T(w1,Res(Do(trm.a1,While(p1,trm.act1)),p2)) ≡
     T(w1,Res(Do(trm.a1,If(p1,(trm.act1; While(p1,trm.act1)),Nil)),p2)))

R5. ∀trm.a1,w1,w2(R(Do(trm.a1,Nil),w1,w2) ≡ (w1 = w2))

R6. ∀w1,trm.ev1,p1
    (T(w1,Res1(trm.ev1,p1)) ≡ ∀w2(R(D(w1,trm.ev1),w1,w2) ⊃ T(w2,p1)))

P1. ∀a1,x1,x2,w1,w2
    (∃w2(R(:Do(a1,:Puton(x1,x2)),w1,w2)) ≡
     ((:Block(x1) ∧ ∀x3(¬H(w1,:On(x3,x1)))) ∧
      (((x1 ≠ x2) ∧ ∀x3(¬H(w1,:On(x3,x2)))) ∨ :Table(x2))))

P2. ∀a1,x1,x2,w1,w2
    (R(:Do(a1,:Puton(x1,x2)),w1,w2) ⊃
     (H(w2,:On(x1,x2)) ∧ ∀x3((x2 ≠ x3) ⊃ ¬H(w2,:On(x1,x3)))))

P3. ∀a1,x1,x2,w1,w2
    (R(:Do(a1,:Puton(x1,x2)),w1,w2) ⊃
     (∀int.trm1(V(w1,int.trm1) = V(w2,int.trm1)) ∧
      ∀int.p1(∀x3(int.p1 ≠ :On(x1,x3)) ⊃ (H(w1,int.p1) ≡ H(w2,int.p1)))))

P4. ∀a1,x1,x2,w1,w2
    (R(:Do(a1,:Puton(x1,x2)),w1,w2) ⊃
     ∀w3(K(a1,w2,w3) ≡ ∃w4(K(a1,w1,w4) ∧ R(:Do(a1,:Puton(x1,x2)),w4,w3))))

C1. ∀w1,trm.a1,trm.act1,p1
    (T(w1,Know(trm.a1,And(Eq(@(D(w1,trm.act1)),trm.act1),
                          Res(Do(@(D(w1,trm.a1)),trm.act1),p1)))) ⊃
     T(w1,Can(trm.a1,trm.act1,p1)))

C2. ∀w1,trm.a1,trm.act1,trm.act2,p1
    (T(w1,Can(trm.a1,(trm.act1; trm.act2),p1)) ≡
     T(w1,Can(trm.a1,trm.act1,Can(@(D(w1,trm.a1)),trm.act2,p1))))

C3. ∀w1,trm.a1,trm.act1,trm.act2,p1,p2
    (T(w1,Can(trm.a1,If(p1,trm.act1,trm.act2),p2)) ≡
     ((T(w1,Know(trm.a1,p1)) ∧ T(w1,Can(trm.a1,trm.act1,p2))) ∨
      (T(w1,Know(trm.a1,Not(p1))) ∧ T(w1,Can(trm.a1,trm.act2,p2)))))

C4. ∀w1,trm.a1,trm.act1,p1,p2
    (T(w1,Can(trm.a1,While(p1,trm.act1),p2)) ≡
     T(w1,Can(trm.a1,If(p1,(trm.act1; While(p1,trm.act1)),Nil),p2)))


D1. ∀a1,x1,x2,w1
    (∃w2(R(:Do(a1,:Dial(x1,x2)),w1,w2)) ≡
     (∃w3(x1 = V(w3,:Comb(x2))) ∧ :Safe(x2) ∧ H(w1,:At(a1,x2))))

D2. ∀a1,x1,x2,w1,w2
    (R(:Do(a1,:Dial(x1,x2)),w1,w2) ⊃
     (((x1 = V(w1,:Comb(x2))) ⊃ H(w2,:Open(x2))) ∧
      (((x1 ≠ V(w1,:Comb(x2))) ∧ ¬H(w1,:Open(x2))) ⊃ ¬H(w2,:Open(x2))) ∧
      (H(w1,:Open(x2)) ⊃ H(w2,:Open(x2)))))

D3. ∀a1,x1,x2,w1,w2
    (R(:Do(a1,:Dial(x1,x2)),w1,w2) ⊃
     ∀w3(K(a1,w2,w3) ≡ ((H(w2,:Open(x2)) ≡ H(w3,:Open(x2))) ∧
                        ∃w4(K(a1,w1,w4) ∧ R(:Do(a1,:Dial(x1,x2)),w4,w3)))))

D4. ∀a1,x1,x2,w1,w2
    (R(:Do(a1,:Dial(x1,x2)),w1,w2) ⊃
     (∀int.trm1(V(w1,int.trm1) = V(w2,int.trm1)) ∧
      ∀int.p1((int.p1 ≠ :Open(x2)) ⊃ (H(w1,int.p1) ≡ H(w2,int.p1)))))

ABV1. ∀w1,trm.x1,trm.x2
    (T(w1,Above(trm.x1,trm.x2)) ≡
     (T(w1,On(trm.x1,trm.x2)) ∨
      ∃x3(T(w1,Above(trm.x1,@(x3))) ∧ T(w1,Above(@(x3),trm.x2)))))

A1. ∀w1,a1,x1(H(w1,:At(a1,x1)) ⊃ ∀w2(K(a1,w1,w2) ⊃ H(w2,:At(a1,x1))))

INF1. ∀w1,trm.x1,exp1
    (T(w1,Info(trm.x1,exp1)) ≡ (exp1 = V(w1,:Info(D(w1,trm.x1)))))

RD1. ∀a1,x1,w1,w2
    (∃w2(R(:Do(a1,:Read(x1)),w1,w2)) ≡
     (H(w1,:Reads(a1)) ∧ H(w1,:At(a1,x1))))

RDS1. ∀w1,a1(H(w1,:Reads(a1)) ⊃ ∀w2(K(a1,w1,w2) ⊃ H(w2,:Reads(a1))))
Appendix B: Procedurally Interpreted Axioms for Knowledge and Action


L1.  True(p1) <=> T(W0,p1)

L2.  T(w1,And(p1,p2)) <=> (T(w1,p1) ∧ T(w1,p2))

L3.  T(w1,Or(p1,p2)) <=> (T(w1,p1) ∨ T(w1,p2))

L4a.  T(w1,(p1 -> p2)) <=> (T(w1,p1) -> T(w1,p2))

L4b.  T(w1,(p1 <- p2)) <=> (T(w1,p1) <- T(w1,p2))

L4c.  T(w1,(p1 => p2)) <=> (T(w1,p1) => T(w1,p2))

L4d.  T(w1,(p1 <= p2)) <=> (T(w1,p1) <= T(w1,p2))

L5a.  T(w1,(p1 <-> p2)) <=> (T(w1,p1) <-> T(w1,p2))

L5b.  T(w1,(p1 <=> p2)) <=> (T(w1,p1) <=> T(w1,p2))

L6.  T(w1,Not(p1)) <=> ¬T(w1,p1)

L7.  T(w1,Exist(?S1,P)) <=> ∃s1(T(w1,P[@(s1)/?S1]))

L8.  T(w1,All(?S1,P)) <=> ∀s1(T(w1,P[@(s1)/?S1]))

L9a.  T(w1,P(trm1,...,trmn)) <=> H(w1,:P(D(w1,trm1),...,D(w1,trmn)))

if P is not an essential property of the things it is true of.

L9b.  T(w1,P(trm1,...,trmn)) <=> :P(D(w1,trm1),...,D(w1,trmn))

if P is an essential property of the things it is true of.

L10a.  D(w1,@(x1)) = x1

L10b.  (@(x1) = @(x2)) <=> (x1 = x2)

L11a.  D(w1,Cnst) = V(w1,:Cnst) if Cnst is not a rigid designator.

L11b.  D(w1,Cnst) = :Cnst if Cnst is a rigid designator.

L12a.  D(w1,F(trm1,...,trmn)) = V(w1,:F(D(w1,trm1),...,D(w1,trmn)))

if F is not a rigid function.

L12b.  D(w1,F(trm1,...,trmn)) = :F(D(w1,trm1),...,D(w1,trmn))
if F is a rigid function.


L13.  T(w1,Eq(trm1,trm2)) <=> (D(w1,trm1) = D(w1,trm2))
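The denotation axioms L10a through L13 can be read as a recursive evaluator over terms. The sketch below is one possible rendering, not the thesis's implementation. The tuple encoding of terms, the table `V` of world-dependent values, and the set `rigid` are all assumptions made for illustration, and a rigid symbol such as :Cnst is simply represented by its own name.

```python
def D(V, rigid, w, term):
    """Denotation of an object-language term in world w.

    Assumed encoding: ("@", x) is a standard name, ("const", name) is a
    constant, and ("app", f, [args]) is a function application.
    """
    kind = term[0]
    if kind == "@":                      # L10a: a standard name denotes x itself
        return term[1]
    if kind == "const":
        name = term[1]
        # L11b: a rigid constant has the same denotation in every world.
        # L11a: otherwise the value is looked up in world w.
        return name if name in rigid else V[w][name]
    if kind == "app":
        f = term[1]
        args = tuple(D(V, rigid, w, t) for t in term[2])
        # L12b and L12a: the same split, applied to function symbols.
        return (f, args) if f in rigid else V[w][(f, args)]
    raise ValueError(f"unknown term: {term!r}")


def T_eq(V, rigid, w, t1, t2):
    """L13: Eq(t1, t2) is true in w iff both terms denote the same thing in w."""
    return D(V, rigid, w, t1) == D(V, rigid, w, t2)


# Two worlds that disagree about the combination of the safe Sf.
V = {
    "W0": {("Comb", ("Sf",)): "12-34"},
    "W1": {("Comb", ("Sf",)): "56-78"},
}
rigid = {"Sf"}
comb_sf = ("app", "Comb", [("const", "Sf")])

print(D(V, rigid, "W0", comb_sf))
print(T_eq(V, rigid, "W1", comb_sf, ("@", "12-34")))
```

The term Comb(Sf) is not rigid here, so its denotation varies from world to world, while the standard name of a combination denotes the same thing everywhere.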

K1.  T(w1,Know(trm.a1,p1)) <=> ∀w2(K(D(w1,trm.a1),w1,w2) -> T(w2,p1))

K2.  K(a1,w1,w1)

K3.  K(a1,w1,w2)/[w1 ≠ w2] -> (K(a1,w2,w3)/[w2 ≠ w3] ->
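To make the truth conditions concrete, the following sketch evaluates T over a small finite model, covering L2, L3, L6 and K1. It is a simplification under stated assumptions: formulas are encoded as tuples, atomic formulas are looked up in a table `H` of facts per world, agent terms are treated as rigid (so D is skipped), and the `model` dictionary layout is invented for the example.

```python
def T(model, w, p):
    """Truth of formula p in world w of a finite model (assumed encoding)."""
    op = p[0]
    if op == "Atom":                     # atomic fact recorded for world w
        return p[1] in model["H"][w]
    if op == "And":                      # L2
        return T(model, w, p[1]) and T(model, w, p[2])
    if op == "Or":                       # L3
        return T(model, w, p[1]) or T(model, w, p[2])
    if op == "Not":                      # L6
        return not T(model, w, p[1])
    if op == "Know":                     # K1: true in every K-accessible world
        agent, body = p[1], p[2]
        return all(T(model, w2, body) for w2 in model["K"][agent][w])
    raise ValueError(f"unknown formula: {p!r}")


# John cannot tell W0 (safe open) from W1 (safe closed).
# Every world is accessible from itself, as K2 requires.
model = {
    "H": {"W0": {"Open"}, "W1": set()},
    "K": {"John": {"W0": {"W0", "W1"}, "W1": {"W0", "W1"}}},
}
is_open = ("Atom", "Open")

print(T(model, "W0", is_open))
print(T(model, "W0", ("Know", "John", is_open)))
```

The safe is open in W0, but John does not know it, because a world in which it is closed is still compatible with what he knows.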

R1.  T(w1,Res(trm.ev1,p1))/

[(trm.ev1 ≠ Do(trm.a1,(trm.act2; trm.act3))) ∧
(trm.ev1 ≠ Do(trm.a1,If(p2,trm.act2,trm.act3))) ∧
(trm.ev1 ≠ Do(trm.a1,While(p2,trm.act2)))] <=>

∃w2(R(D(w1,Do(trm.a1,trm.act1)),w1,w2) ∧ T(w2,p1))

R2.  T(w1,Res(Do(trm.a1,(trm.act1; trm.act2)),p1)) <=>

T(w1,Res(Do(trm.a1,trm.act1),Res(Do(@(D(w1,trm.a1)),trm.act2),p1)))

R3.  T(w1,Res(Do(trm.a1,If(p1,trm.act1,trm.act2)),p2)) <=>

((T(w1,p1) ∧ T(w1,Res(Do(trm.a1,trm.act1),p2))) ∨
(¬T(w1,p1) ∧ T(w1,Res(Do(trm.a1,trm.act2),p2))))

R4.  T(w1,Res(Do(trm.a1,While(p1,trm.act1)),p2)) <=>

T(w1,Res(Do(trm.a1,If(p1,(trm.act1; While(p1,trm.act1)),Nil)),p2))

R5.  R(Do(trm.a1,Nil),w1,w2) <=> (w1 = w2)

R6.  T(w1,Res1(trm.ev1,p1))/

[(trm.ev1 ≠ Do(trm.a1,(trm.act2; trm.act3))) ∧
(trm.ev1 ≠ Do(trm.a1,If(p2,trm.act2,trm.act3))) ∧
(trm.ev1 ≠ Do(trm.a1,While(p2,trm.act2)))] <=>
∀w2(R(D(w1,Do(trm.a1,trm.act1)),w1,w2) -> T(w2,p1))
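The reductions in R2 through R5 amount to a small interpreter for compound actions. The sketch below shows that reading for a single fixed agent. It is illustrative only: actions are encoded as nested tuples, tests are Python predicates on worlds rather than object-language formulas, and `step` is an assumed function giving the successor worlds of a primitive action. The two helper names `res_some` and `res_every` are hypothetical; they correspond to the existential form in R1 and the universal form in R6.

```python
def results(step, act, w):
    """Set of worlds that can result from performing act in world w."""
    if act == "Nil":                     # R5: Nil leaves the world unchanged
        return {w}
    if isinstance(act, str):             # primitive action: consult R directly
        return set(step(act, w))
    tag = act[0]
    if tag == "Seq":                     # R2: do the first part, then the second
        return {w3
                for w2 in results(step, act[1], w)
                for w3 in results(step, act[2], w2)}
    if tag == "If":                      # R3: branch on the test in w
        _, test, then_act, else_act = act
        return results(step, then_act if test(w) else else_act, w)
    if tag == "While":                   # R4: unroll the loop one step
        _, test, body = act
        return results(step, ("If", test, ("Seq", body, act), "Nil"), w)
    raise ValueError(f"unknown action: {act!r}")


def res_some(step, act, w, p):
    """Existential reading: some resulting world satisfies p."""
    return any(p(w2) for w2 in results(step, act, w))


def res_every(step, act, w, p):
    """Universal reading: every resulting world satisfies p."""
    return all(p(w2) for w2 in results(step, act, w))


# Toy domain: a world is a counter; "dec" is possible only when it is positive.
def step(act, w):
    return [w - 1] if act == "dec" and w > 0 else []


countdown = ("While", lambda w: w > 0, "dec")
print(results(step, countdown, 3))
print(res_some(step, ("Seq", "dec", "dec"), 1, lambda w: True))
```

As with the axioms themselves, unrolling a While whose test never becomes false does not terminate, so the sketch is only suitable for loops that are known to halt.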

C1.  T(w1,Can(trm.a1,trm.act1,p1))/

[(trm.act1 ≠ (trm.act2; trm.act3)) ∧
(trm.act1 ≠ If(p2,trm.act2,trm.act3)) ∧
(trm.act1 ≠ While(p2,trm.act2))] <=

T(w1,Know(trm.a1,And(Eq(@(D(w1,trm.act1)),trm.act1),
Res(Do(@(D(w1,trm.a1)),trm.act1),p1))))

C2.  T(w1,Can(trm.a1,(trm.act1; trm.act2),p1)) <=>

T(w1,Can(trm.a1,trm.act1,Can(@(D(w1,trm.a1)),trm.act2,p1)))

C3.  T(w1,Can(trm.a1,If(p1,trm.act1,trm.act2),p2)) <=>

((T(w1,p1) ∧ T(w1,Can(trm.a1,trm.act1,p2))) ∨
(¬T(w1,p1) ∧ T(w1,Can(trm.a1,trm.act2,p2))))

C4.  T(w1,Can(trm.a1,While(p1,trm.act1),p2)) <=>

T(w1,Can(trm.a1,If(p1,(trm.act1; While(p1,trm.act1)),Nil),p2))
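For a primitive action, C1 gives a sufficient condition for Can: the agent must know both which action the term denotes and that performing it achieves the goal. The sketch below checks that condition on the safe example. All names are assumptions made for illustration: a world is a `World` record, `dial` stands in for the effect of a Dial action, and an action term is modelled as a function from a world to the combination it would dial there.

```python
from typing import NamedTuple


class World(NamedTuple):
    comb: str        # the safe's actual combination in this world
    is_open: bool    # whether the safe is open in this world


def dial(w, x):
    """Effect of dialing x: the safe opens if x is the combination."""
    return World(w.comb, w.is_open or w.comb == x)


def can(k_worlds, w, act_term, goal):
    """Sufficient condition of C1 for a primitive action.

    In every world compatible with the agent's knowledge, the action term
    must denote what it denotes in w, and doing it must achieve the goal.
    """
    act = act_term(w)                                  # D(w1,trm.act1)
    return all(act_term(w2) == act and goal(dial(w2, act))
               for w2 in k_worlds)


actual = World("12-34", False)
unsure = {World("12-34", False), World("56-78", False)}   # combination unknown
sure = {World("12-34", False)}                            # combination known


def dial_the_combination(w):
    return w.comb          # the term Dial(Comb(Sf)), which is not rigid


def safe_is_open(w):
    return w.is_open


print(can(unsure, actual, dial_the_combination, safe_is_open))
print(can(sure, actual, dial_the_combination, safe_is_open))
```

An agent who does not know the combination fails the test even though dialing the combination would in fact open the safe, because the description "the combination" picks out different actions in different worlds compatible with his knowledge.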

D1a.  R(:Do(a1,:Dial(x1,x2)),w1,w2) =>

(∃w3(V(w3,:Comb(x2)) = x1) ∧ :Safe(x2) ∧ H(w1,:At(a1,x2)))

D1b.  R(:Do(a1,:Dial(x1,x2)),w1,f1(a1,x1,x2,w1)) <=

((V(w3,:Comb(x2)) = x1) ∧ :Safe(x2) ∧ H(w1,:At(a1,x2)))

D2.  R(:Do(a1,:Dial(x1,x2)),w1,w2) ->

((H(w2,int.p1) <=>

(((int.p1 = :Open(x2)) ∧
((V(w1,:Comb(x2)) = x1) ∨ H(w1,:Open(x2)))) ∨
((int.p1 ≠ :Open(x2)) ∧ H(w1,int.p1)))) ∧
(V(w2,int.trm1) = V(w1,int.trm1)))


D3.  R(:Do(a1,:Dial(x1,x2)),w1,w2) ->

(K(a1,w2,w3)/[w2 ≠ w3] <->

(∃w4(K(a1,w1,w4)/[w1 ≠ w4] ∧
R(:Do(a1,:Dial(x1,x2)),w4,w3)) ∧
(H(w2,:Open(x2)) <=> H(w3,:Open(x2)))))
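D2 and D3 together describe how dialing changes both the world and what the agent knows. The sketch below traces that update on a two-world example. It makes several simplifying assumptions that are not in the text: a world is a `World` record holding only the combination and whether the safe is open, the preconditions in D1a are taken to hold, and the function names are invented for the illustration.

```python
from typing import NamedTuple


class World(NamedTuple):
    comb: str        # the safe's combination in this world
    is_open: bool    # whether the safe is open in this world


def dial(w, x):
    """D2: the safe is open afterwards iff x is the combination or it was open."""
    return World(w.comb, w.is_open or w.comb == x)


def k_after_dial(k_before, w1, x):
    """D3: worlds compatible with the agent's knowledge after dialing x in w1.

    A world w3 qualifies if it results from dialing x in some previously
    compatible world w4, and it agrees with the actual result w2 on whether
    the safe is open.
    """
    w2 = dial(w1, x)
    successors = {dial(w4, x) for w4 in k_before}
    return {w3 for w3 in successors if w3.is_open == w2.is_open}


w1 = World("12-34", False)
k_before = {World("12-34", False), World("56-78", False)}

print(k_after_dial(k_before, w1, "12-34"))
print(k_after_dial(k_before, w1, "00-00") == k_before)
```

Dialing the right number leaves only one compatible world, so the agent now knows the combination. Dialing a number that was already ruled out tells him nothing new.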

A1.  K(a1,w1,w2)/[w1 ≠ w2] ->

(H(w2,:At(a1,x1)) ∨ ¬H(w1,:At(a1,x1)))

INF1.  T(w1,Info(trm.x1,exp1)) <=> (V(w1,:Info(D(w1,trm.x1))) = exp1)

RDS1.  K(a1,w1,w2)/[w1 ≠ w2] ->

(H(w2,:Reads(a1)) ∨ ¬H(w1,:Reads(a1)))

RD1a.  R(:Do(a1,:Read(x1)),w1,w2) =>

(H(w1,:Reads(a1)) ∧ H(w1,:At(a1,x1)))

RD1b.  R(:Do(a1,:Read(x1)),w1,f2(a1,x1,w1)) <=

(H(w1,:Reads(a1)) ∧ H(w1,:At(a1,x1)))

RD2.  R(:Do(a1,:Read(x1)),w1,w2) ->

(K(a1,w2,w3)/[w2 ≠ w3] <->

∃w4(K(a1,w1,w4)/[w1 ≠ w4] ∧

(V(w4,:Info(x1)) = V(w1,:Info(x1))) ∧
R(:Do(a1,:Read(x1)),w4,w3)))

RD3.  R(:Do(a1,:Read(x1)),w1,w2) ->

((V(w2,int.trm1) = V(w1,int.trm1)) ∧
(H(w2,int.p1) <=> H(w1,int.p1)))
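RD2 and RD3 can be read as a filter on the agent's compatible worlds: reading changes no facts, and afterwards only worlds that agree with the actual one on the information read remain compatible. The sketch below shows that filter. It assumes, purely for illustration, that a world is a pair of facts (what is written on the paper, and the weather), that worlds are identified with their facts so a world and its successor under Read coincide, and that the preconditions in RD1a hold.

```python
def k_after_read(k_before, w1, info):
    """RD2: worlds compatible with the reader's knowledge after reading in w1.

    info(w) gives the information on the thing being read in world w.
    Since RD3 says reading changes no facts, each surviving world is just
    a previously compatible world that matches w1 on that information.
    """
    return {w4 for w4 in k_before if info(w4) == info(w1)}


def written_on_paper(w):
    return w[0]


# Each world: (what is written on the paper, what the weather is).
w1 = ("12-34", "raining")
k_before = {
    ("12-34", "raining"),
    ("56-78", "raining"),
    ("12-34", "dry"),
}

k_after = k_after_read(k_before, w1, written_on_paper)
print(sorted(k_after))
```

After reading, every compatible world agrees on what the paper says, so the agent knows it. He is no better informed about the weather, which reading did not reveal.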